Posted to java-user@lucene.apache.org by juniol <mr...@hotmail.fr> on 2010/03/22 18:37:22 UTC
how to filter numeric values?
Hello,
I want to filter my tokens and keep only string tokens (remove numbers, etc.).
I use this:
public TokenStream tokenStream(String fieldName, Reader reader) {
    return new PorterStemFilter(
        new StopFilter(
            new LowerCaseFilter(
                new StandardFilter(
                    new StandardTokenizer(reader))),
            stopset));
}
thanks
--
View this message in context: http://old.nabble.com/how-to-filter-numeric-values--tp27989882p27989882.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: how to filter numeric values?
Posted by Erick Erickson <er...@gmail.com>.
Why not just use SimpleAnalyzer? From the javadocs:

    An Analyzer [1] that filters LetterTokenizer [2] with LowerCaseFilter [3]

[1] http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/Analyzer.html
[2] http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/LetterTokenizer.html
[3] http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/LowerCaseFilter.html
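To make the effect of letter-only tokenizing concrete, here is a minimal sketch in plain Java with no Lucene dependency. It is not Lucene's code: the class name LetterTokens and its tokenize method are hypothetical, and the sketch assumes that a token is a maximal run of characters for which Character.isLetter returns true, lower-cased afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterTokens {
    // Collects maximal runs of letters and lower-cases them.
    // Digits and punctuation act as separators, so they never reach the output.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [order, shipped, widgets, ref, abc, def]
        System.out.println(tokenize("Order 66 shipped 12 Widgets, ref abc123def"));
    }
}
```

Note that a mixed token such as abc123def is not dropped as a whole: it is split into abc and def. If such tokens should disappear entirely, a letter-only tokenizer is probably not the right tool.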
Erick
On Mon, Mar 22, 2010 at 2:09 PM, juniol <mr...@hotmail.fr> wrote:
>
> Hello, thanks for the reply.
> I found another solution:
>
> StopAnalyzer std1 = new StopAnalyzer(Version.LUCENE_CURRENT);
> PorterStemFilter std = new PorterStemFilter(std1.tokenStream("field", reader));
Re: how to filter numeric values?
Posted by juniol <mr...@hotmail.fr>.
Hello, thanks for the reply.
I found another solution:

StopAnalyzer std1 = new StopAnalyzer(Version.LUCENE_CURRENT);
PorterStemFilter std = new PorterStemFilter(std1.tokenStream("field", reader));
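As a rough illustration of why this removes numbers, the sketch below mimics in plain Java what a letter-based tokenizer followed by a stop-word filter does to a piece of text. It does not use Lucene and it leaves out the stemming step. The class name StopWordSketch, the analyze method and the small stop list are all hypothetical; Lucene ships its own default stop set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "in", "of", "and"));

    // Lower-cases the text, splits on every run of non-letters
    // (so digits vanish), then drops stop words.
    static List<String> analyze(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            // split() can yield an empty first element when the text starts with a non-letter
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints [kings, towers]
        System.out.println(analyze("The 3 Kings of 1066 and the 2 towers"));
    }
}
```

In the real pipeline the surviving tokens would then pass through PorterStemFilter.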
Re: how to filter numeric values?
Posted by Ahmet Arslan <io...@yahoo.com>.
> hello;
>
> i want to filter my tokens and keep only string tokens (remove numbers, etc.).
> i use this:
>
> public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new PorterStemFilter(
>         new StopFilter(
>             new LowerCaseFilter(
>                 new StandardFilter(
>                     new StandardTokenizer(reader))),
>             stopset));
> }
Why not use LowerCaseTokenizer [1] instead of StandardTokenizer + StandardFilter + LowerCaseFilter?
[1] http://lucene.apache.org/java/2_9_2/api/core/org/apache/lucene/analysis/LowerCaseTokenizer.html