Posted to java-user@lucene.apache.org by juniol <mr...@hotmail.fr> on 2010/03/22 18:37:22 UTC

how to filter numeric values?

Hello,

I want to filter my tokens and keep only string tokens (removing numbers,
etc.). I currently use this:

public TokenStream tokenStream(String fieldName, Reader reader) {
    return new PorterStemFilter(
        new StopFilter(
            new LowerCaseFilter(
                new StandardFilter(
                    new StandardTokenizer(reader))),
            stopset));
}

Thanks

-- 
View this message in context: http://old.nabble.com/how-to-filter-numeric-values--tp27989882p27989882.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: how to filter numeric values?

Posted by Erick Erickson <er...@gmail.com>.

Why not just use SimpleAnalyzer? From the javadocs:

  "An Analyzer that filters LetterTokenizer with LowerCaseFilter"

  Analyzer: http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/Analyzer.html
  LetterTokenizer: http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/LetterTokenizer.html
  LowerCaseFilter: http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/analysis/LowerCaseFilter.html

Erick
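
To illustrate why a letter-based tokenizer followed by lowercasing removes numbers, here is a minimal plain-Java sketch of that behaviour. It does not use Lucene; the class and method names (LetterTokens, tokenize) are made up for this example, and it only approximates what LetterTokenizer plus LowerCaseFilter do.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this is not Lucene code.
final class LetterTokens {

    // Splits text into maximal runs of letters and lowercases each run.
    // Digits and punctuation act as separators, so they never reach the output.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With this sketch, LetterTokens.tokenize("Lucene 3.0 was released in 2010") returns [lucene, was, released, in]. Note that a mixed token such as "abc123def" is split into "abc" and "def" rather than dropped as a whole.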

On Mon, Mar 22, 2010 at 2:09 PM, juniol <mr...@hotmail.fr> wrote:

>
> Hello, thanks for the reply.
> I found another solution:
>
> StopAnalyzer std1 = new StopAnalyzer(Version.LUCENE_CURRENT);
> PorterStemFilter std = new PorterStemFilter(std1.tokenStream("field", reader));

Re: how to filter numeric values?

Posted by juniol <mr...@hotmail.fr>.

Hello, thanks for the reply.
I found another solution:

StopAnalyzer std1 = new StopAnalyzer(Version.LUCENE_CURRENT);
PorterStemFilter std = new PorterStemFilter(std1.tokenStream("field", reader));





Re: how to filter numeric values?

Posted by Ahmet Arslan <io...@yahoo.com>.


> Hello,
>
> I want to filter my tokens and keep only string tokens (removing numbers,
> etc.). I currently use this:
>
> public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new PorterStemFilter(
>         new StopFilter(
>             new LowerCaseFilter(
>                 new StandardFilter(
>                     new StandardTokenizer(reader))),
>             stopset));
> }


Why not use LowerCaseTokenizer [1] instead of StandardTokenizer + StandardFilter + LowerCaseFilter?

[1] http://lucene.apache.org/java/2_9_2/api/core/org/apache/lucene/analysis/LowerCaseTokenizer.html
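
One caveat: LowerCaseTokenizer divides text at every non-letter, so a mixed token such as "x86" is cut down to "x". If the goal is instead to keep the existing tokens and discard only those that are purely numeric, the check could look roughly like the sketch below. This is plain Java with hypothetical names (NumericTokenFilter, isNumeric, dropNumeric), not a Lucene class; in Lucene the same test would presumably sit inside a custom TokenFilter.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this is not a Lucene TokenFilter.
final class NumericTokenFilter {

    // True when the token has at least one digit and consists only of
    // digits plus '.' or ',' separators (for example "2010", "3.0", "1,000").
    static boolean isNumeric(String token) {
        boolean sawDigit = false;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isDigit(c)) {
                sawDigit = true;
            } else if (c != '.' && c != ',') {
                return false;
            }
        }
        return sawDigit;
    }

    // Returns a new list without the purely numeric tokens;
    // mixed tokens such as "x86" are kept unchanged.
    static List<String> dropNumeric(List<String> tokens) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (!isNumeric(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

For the input [lucene, 3.0, x86, 2010] this sketch keeps [lucene, x86].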


      
