Posted to java-user@lucene.apache.org by Mário André <ma...@infonet.com.br> on 2009/12/30 18:18:15 UTC
Question about TokenStream lucene 3.0
Hi,
I have the method below:
public final TokenStream tokenStream(String fieldName, Reader reader)
{
    TokenStream result = new LowerCaseTokenizer(reader);
    result = new StopFilter(true, result, stopWords, true);
    result = new PorterStemFilter(result);
    return result;
}
It lowercases the text, removes the stop words and applies PorterStemFilter.
How can I consume the resulting TokenStream in release 3.0, i.e. print the
tokens this method produces?
I tried to use:
public static void main(String[] args) throws IOException, ParseException
{
    StringReader sr = new StringReader("The man is very good. He talked about many thigs");
    PorterStemAnalyzer ps = new PorterStemAnalyzer();
    TokenStream tokenstream = ps.tokenStream(null, sr);
    //Tokenizer token = (Tokenizer) ps.tokenStream(null, sr);
    while(tokenstream.incrementToken())
    {
        ????????
    }
}
Thanks.
---------------------------------------------------------------------
Mário André
Instituto Federal de Educação, Ciência e Tecnologia de Sergipe - IFS
Master's student in MCC - Universidade Federal de Alagoas - UFAL
http://www.marioandre.com.br/
Skype: mario-fa
----------------------------------------------------------------------------
----------
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: RES: Question about TokenStream lucene 3.0
Posted by AHMET ARSLAN <io...@yahoo.com>.
> System.out.println(typeAtt.type());
> ??? And this typeAtt?
>
> Thanks!
>
Yes. You can add other attributes if you want. By the way, I forgot to remove the (TermAttribute) and (TypeAttribute) casts. You don't need them in 3.0.0:
TermAttribute termAtt = tokenStream.getAttribute(TermAttribute.class);
TypeAttribute typeAtt = tokenStream.getAttribute(TypeAttribute.class);
RES: Question about TokenStream lucene 3.0
Posted by Mário André <ma...@infonet.com.br>.
System.out.println(typeAtt.type());
??? And what about this typeAtt?
Thanks!
-----Original message-----
From: AHMET ARSLAN [mailto:iorixxx@yahoo.com]
Sent: Wednesday, 30 December 2009 15:26
To: java-user@lucene.apache.org
Subject: Re: Question about TokenStream lucene 3.0
Re: Question about TokenStream lucene 3.0
Posted by AHMET ARSLAN <io...@yahoo.com>.
> Using PorterStemFilter and removing the stopwords, but how can I use
> TokenStream in release 3.0 (print the result this method).
>
> I tried to use:
>
> public static void main(String[] args) throws IOException, ParseException
> {
>     StringReader sr = new StringReader("The man is very good. He talked about many thigs");
>     PorterStemAnalyzer ps = new PorterStemAnalyzer();
>     TokenStream tokenstream = ps.tokenStream(null, sr);
>
>     //Tokenizer token = (Tokenizer) ps.tokenStream(null, sr);
>     while(tokenstream.incrementToken())
>     {
>         ????????
>     }
> }
>
> Thanks.
You can use this method to display:
public static void displayTokenStream(TokenStream tokenStream) throws IOException {
    TermAttribute termAtt = (TermAttribute) tokenStream.getAttribute(TermAttribute.class);
    TypeAttribute typeAtt = (TypeAttribute) tokenStream.getAttribute(TypeAttribute.class);
    while (tokenStream.incrementToken()) {
        System.out.print(termAtt.term());
        System.out.print(' ');
        System.out.println(typeAtt.type());
    }
}
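
For anyone wondering why the attributes are fetched once, before the loop: the stream reuses the same attribute object and overwrites its contents on every incrementToken() call, so you read it again after each call. The sketch below mimics that pattern in plain Java with no Lucene dependency. Everything in it (AttributePatternSketch, TermAttr, SimpleStream, collectTerms) is a hypothetical stand-in invented for illustration, not Lucene's API, and it only lowercases and splits on non-letters (no stop-word removal, no stemming).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

/** Dependency-free sketch of the attribute pattern. NOT Lucene code. */
public class AttributePatternSketch {

    /** Hypothetical stand-in for a term attribute: one mutable holder reused for every token. */
    static final class TermAttr {
        private String term = "";
        String term() { return term; }
        void setTerm(String t) { term = t; }
    }

    /** Hypothetical stand-in for a token stream: lowercases and splits on non-letters. */
    static final class SimpleStream {
        private final TermAttr termAttr = new TermAttr();
        private final Iterator<String> words;

        SimpleStream(String text) {
            List<String> list = new ArrayList<String>();
            for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (!w.isEmpty()) {
                    list.add(w);
                }
            }
            words = list.iterator();
        }

        /** Always returns the same shared attribute instance. */
        TermAttr getTermAttr() { return termAttr; }

        /** Advances to the next token and overwrites the shared attribute in place. */
        boolean incrementToken() {
            if (!words.hasNext()) {
                return false;
            }
            termAttr.setTerm(words.next());
            return true;
        }
    }

    /** Fetch the attribute once, then read it after each successful incrementToken(). */
    static List<String> collectTerms(SimpleStream stream) {
        TermAttr termAttr = stream.getTermAttr();
        List<String> out = new ArrayList<String>();
        while (stream.incrementToken()) {
            out.add(termAttr.term()); // copy the value now; the holder changes on the next call
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms(new SimpleStream("The man is very good.")));
    }
}
```

Running main prints [the, man, is, very, good]. The loop has the same shape as displayTokenStream above; only the stand-in classes differ.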