Posted to java-user@lucene.apache.org by Sirish Vadala <si...@gmail.com> on 2013/06/14 19:30:42 UTC

KStemFilter

Hello All,

I have a new requirement in my text search implementation to perform
stemming. I did some research and implemented Snowball, but the customers
found it too aggressive, and eventually I got them to agree to compromise
on the KStem algorithm.

My existing code is on Lucene 2.9, and I would like to move it to the
latest release, Lucene 4.3. So I have decided to build a custom analyzer
that uses the KStem filter.

    public class KStemAnalyzer extends Analyzer {

        @Override
        public final TokenStream tokenStream(String fieldName, Reader reader) {
            TokenStream result = new StandardTokenizer(Version.LUCENE_43, reader);
            result = new StandardFilter(Version.LUCENE_43, result);
            result = new LowerCaseFilter(Version.LUCENE_43, result);
            result = new StopFilter(Version.LUCENE_43, result,
                    StandardAnalyzer.STOP_WORDS_SET);
            return new KStemFilter(result);
        }

        @Override
        protected TokenStreamComponents createComponents(String string, Reader reader) {
            throw new UnsupportedOperationException("Not supported yet.");
        }
    }

However, I get the compile error 'tokenStream(String,Reader) in KStemAnalyzer
cannot override tokenStream(String,Reader) in Analyzer; overridden method is
final'. I looked for documentation or example implementations, but all I
could find was the API reference, which is not very descriptive.

Any hint on how to initialize this would be highly appreciated.

Thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/KStemFilter-tp4070558.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: KStemFilter

Posted by Sirish Vadala <si...@gmail.com>.
Awesome! Exactly what I was looking for.

Thanks Schindler.


Uwe Schindler wrote
> Look at the javadocs of the analysis package and the Analyzer class; they
> explain how Analyzers are built. The first example there is the way to
> go:
> http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/analysis/Analyzer.html




--
View this message in context: http://lucene.472066.n3.nabble.com/KStemFilter-tp4070558p4070573.html


RE: KStemFilter

Posted by Uwe Schindler <uw...@thetaphi.de>.
Look at the javadocs of the analysis package and the Analyzer class; they explain how Analyzers are built. The first example there is the way to go:
http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/analysis/Analyzer.html
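
For readers who find this thread later, here is a minimal sketch of how the pattern from that first javadoc example might be applied to the filter chain in the question. It is not part of the original reply. It assumes Lucene 4.3 with lucene-core and lucene-analyzers-common on the classpath. In 4.x, Analyzer.tokenStream() is final, and createComponents() is the method to override; it returns the tokenizer together with the end of the filter chain.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.KStemFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

// Sketch only: assumes Lucene 4.3 (core + analyzers-common) is available.
final class KStemAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // The tokenizer is kept in its own variable because
        // TokenStreamComponents needs both ends of the chain.
        Tokenizer source = new StandardTokenizer(Version.LUCENE_43, reader);

        // Same filters, in the same order, as in the question.
        TokenStream result = new StandardFilter(Version.LUCENE_43, source);
        result = new LowerCaseFilter(Version.LUCENE_43, result);
        result = new StopFilter(Version.LUCENE_43, result,
                StandardAnalyzer.STOP_WORDS_SET);
        result = new KStemFilter(result);

        return new TokenStreamComponents(source, result);
    }
}
```

Callers still use analyzer.tokenStream(fieldName, reader) to obtain the stream; only the place where the chain is built changes. No tokenStream() override is needed or allowed.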

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
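
As background on the compiler message in the question, here is a Lucene-free sketch of the same situation: a base class whose public entry point is final and whose subclasses customize behavior through a protected hook. All names in it (BaseAnalyzer, LowerCasingAnalyzer, analyze, createPipeline) are hypothetical and only mirror the shape of Analyzer.tokenStream() and createComponents(); they are not Lucene API.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for a framework base class.
abstract class BaseAnalyzer {

    // Final entry point: a subclass that declares analyze(String) again
    // fails to compile with "overridden method is final".
    public final String analyze(String text) {
        return createPipeline().apply(text);
    }

    // The extension point that subclasses are expected to implement.
    protected abstract Function<String, String> createPipeline();
}

// A subclass customizes behavior through the hook, not the final method.
final class LowerCasingAnalyzer extends BaseAnalyzer {

    @Override
    protected Function<String, String> createPipeline() {
        return text -> text.trim().toLowerCase(Locale.ROOT);
    }
}
```

The design lets the base class keep control of the entry point (for example, to reuse components between calls) while subclasses only describe how the pipeline is assembled.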



