Posted to java-user@lucene.apache.org by Mário André <ma...@infonet.com.br> on 2009/12/30 18:18:15 UTC

Question about TokenStream lucene 3.0

Hi,
I have the method below:

    public final TokenStream tokenStream(String fieldName, Reader reader)
    {
        TokenStream result = new LowerCaseTokenizer(reader);
        result = new StopFilter(true, result, stopWords, true);
        result = new PorterStemFilter(result);
        return result;        
    }

This uses PorterStemFilter and removes the stopwords, but how can I use the
TokenStream in release 3.0 (that is, print the result of this method)?

I tried to use:

    public static void main(String[] args) throws IOException, ParseException
    {
      StringReader sr = new StringReader("The man is very good. He talked about many thigs");
      PorterStemAnalyzer ps = new PorterStemAnalyzer();
      TokenStream tokenstream = ps.tokenStream(null, sr);

      //Tokenizer token = (Tokenizer) ps.tokenStream(null, sr);
      while(tokenstream.incrementToken())
      {
          ????????
      }
      
    }

Thanks.

---------------------------------------------------------------------
Mário André
Instituto Federal de Educação, Ciência e Tecnologia de Sergipe - IFS
Mestrando em MCC - Universidade Federal de Alagoas - UFAL
http://www.marioandre.com.br/
Skype: mario-fa
---------------------------------------------------------------------




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: RES: Question about TokenStream lucene 3.0

Posted by AHMET ARSLAN <io...@yahoo.com>.
> System.out.println(typeAtt.type());
> ??? And this typeAtt?
> 
> Thanks!
> 

Yes. You can add the other attributes if you want. By the way, I forgot to remove the (TermAttribute) and (TypeAttribute) casts. You don't need them in 3.0.0, where getAttribute is generic:

TermAttribute termAtt = tokenStream.getAttribute(TermAttribute.class);
TypeAttribute typeAtt = tokenStream.getAttribute(TypeAttribute.class);


      



RES: Question about TokenStream lucene 3.0

Posted by Mário André <ma...@infonet.com.br>.
System.out.println(typeAtt.type()); ??? And this typeAtt?

Thanks!



-----Original message-----
From: AHMET ARSLAN [mailto:iorixxx@yahoo.com]
Sent: Wednesday, December 30, 2009 15:26
To: java-user@lucene.apache.org
Subject: Re: Question about TokenStream lucene 3.0



You can use this method to display:

   public static void displayTokenStream(TokenStream tokenStream) throws IOException {

        TermAttribute termAtt = (TermAttribute) tokenStream.getAttribute(TermAttribute.class);
        TypeAttribute typeAtt = (TypeAttribute) tokenStream.getAttribute(TypeAttribute.class);

        while (tokenStream.incrementToken()) {
            System.out.print(termAtt.term());
            System.out.print(' ');
            System.out.println(typeAtt.type());
        }
    }


      



Re: Question about TokenStream lucene 3.0

Posted by AHMET ARSLAN <io...@yahoo.com>.
> Using PorterStemFilter and removing the stopwords, but how can I use
> TokenStream in release 3.0 (print the result this method).
> 
> I tried to use:
> 
>     public static void main(String[] args) throws IOException, ParseException
>     {
>       StringReader sr = new StringReader("The man is very good. He talked about many thigs");
>       PorterStemAnalyzer ps = new PorterStemAnalyzer();
>       TokenStream tokenstream = ps.tokenStream(null, sr);
> 
>       //Tokenizer token = (Tokenizer) ps.tokenStream(null, sr);
>       while(tokenstream.incrementToken())
>       {
>           ????????
>       }
>       
>     }
> 
> Thanks.

You can use this method to display:

   public static void displayTokenStream(TokenStream tokenStream) throws IOException {

        TermAttribute termAtt = (TermAttribute) tokenStream.getAttribute(TermAttribute.class);
        TypeAttribute typeAtt = (TypeAttribute) tokenStream.getAttribute(TypeAttribute.class);

        while (tokenStream.incrementToken()) {
            System.out.print(termAtt.term());
            System.out.print(' ');
            System.out.println(typeAtt.type());
        }
    }
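
The pattern in the method above (fetch the attribute once before the loop, then read it again after every incrementToken() call) can be hard to see at first. Below is a minimal, library-free sketch of that pattern. It is only an illustration: ToyTokenStream and TermSlot are hypothetical stand-ins written for this example, not Lucene classes, and the toy stream only lowercases and splits on non-letters (no stop-word removal or stemming).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class AttributePatternSketch {

    /** Hypothetical stand-in for an attribute such as TermAttribute: one mutable slot. */
    static final class TermSlot {
        private String term = "";
        String term() { return term; }
        void setTerm(String t) { term = t; }
    }

    /** Hypothetical stand-in for a token stream. NOT the Lucene class. */
    static final class ToyTokenStream {
        private final Map<Class<?>, Object> attributes = new HashMap<Class<?>, Object>();
        private final Iterator<String> words;
        private final TermSlot termSlot = new TermSlot();

        ToyTokenStream(String text) {
            List<String> list = new ArrayList<String>();
            // Lowercase and split on anything that is not a-z; skip empty pieces.
            for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (!w.isEmpty()) {
                    list.add(w);
                }
            }
            words = list.iterator();
            attributes.put(TermSlot.class, termSlot);
        }

        /** Generic lookup, so the caller needs no cast; throws if the attribute is absent. */
        <A> A getAttribute(Class<A> type) {
            Object attribute = attributes.get(type);
            if (attribute == null) {
                throw new IllegalArgumentException("no attribute " + type.getName());
            }
            return type.cast(attribute);
        }

        /** Advances to the next token by overwriting the shared slot in place. */
        boolean incrementToken() {
            if (!words.hasNext()) {
                return false;
            }
            termSlot.setTerm(words.next());
            return true;
        }
    }

    /** Consumer side: the same shape as displayTokenStream, but collecting instead of printing. */
    static List<String> collectTerms(ToyTokenStream stream) {
        TermSlot termAtt = stream.getAttribute(TermSlot.class); // fetched once, before the loop
        List<String> out = new ArrayList<String>();
        while (stream.incrementToken()) {
            out.add(termAtt.term()); // same object, new contents on every iteration
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [the, man, is, very, good]
        System.out.println(collectTerms(new ToyTokenStream("The man is very good.")));
    }
}
```

The point of the sketch is that the attribute reference never changes; only its contents do. That is why the body of the while loop in the original question only needs to read termAtt.term() on each pass.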


      
