Posted to dev@lucene.apache.org by bu...@apache.org on 2004/04/03 23:50:34 UTC

DO NOT REPLY [Bug 28182] New: - [PATCH] Never write an Analyzer again

http://issues.apache.org/bugzilla/show_bug.cgi?id=28182

           Summary: [PATCH] Never write an Analyzer again
           Product: Lucene
           Version: CVS Nightly - Specify date in submission
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: Analysis
        AssignedTo: lucene-dev@jakarta.apache.org
        ReportedBy: grant_ingersoll@yahoo.com


Hi All,

I got sick of writing Analyzers, so I have reworked some of the Analyzer and Filter code by turning
TokenStream, Tokenizer, and TokenFilter into interfaces.  I then created a BaseAnalyzer class on which
you set a Tokenizer and a list of TokenFilters.  The tokenStream() method applies the Tokenizer, then
loops over the list of TokenFilters, applying each one in order, and returns the last one, just as I am
sure you have done by hand many a time before.  One requirement for this to work is that Filters and
Tokenizers must allow their state to be re-initialized through the init() method on TokenStream.
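
The chaining described above can be illustrated with a minimal, self-contained sketch.  It is not the
code from the patch: the interface shapes, the String-valued next(), the two typed init() methods, the
setter names, and the SimpleWhitespaceTokenizer and SimpleLowerCaseFilter classes are all assumptions
made for illustration.  Only the names TokenStream, Tokenizer, TokenFilter, BaseAnalyzer, init(), and
tokenStream() come from the description.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Assumed shape: next() returns the next token text, or null when exhausted.
interface TokenStream {
    String next();
}

// Assumed: a tokenizer is (re-)initialized with the character source.
interface Tokenizer extends TokenStream {
    void init(Reader input);
}

// Assumed: a filter is (re-)initialized with the upstream token stream.
interface TokenFilter extends TokenStream {
    void init(TokenStream input);
}

// Configurable analyzer: one tokenizer followed by an ordered list of filters.
class BaseAnalyzer {
    private Tokenizer tokenizer;
    private final List<TokenFilter> filters = new ArrayList<>();

    void setTokenizer(Tokenizer tokenizer) { this.tokenizer = tokenizer; }

    void setFilters(List<? extends TokenFilter> newFilters) {
        filters.clear();
        filters.addAll(newFilters);
    }

    // Re-initializes the tokenizer, wraps it in each filter in order,
    // and returns the last stream in the chain.
    TokenStream tokenStream(String fieldName, Reader reader) {
        if (tokenizer == null) {
            throw new IllegalStateException("no tokenizer set");
        }
        tokenizer.init(reader);
        TokenStream current = tokenizer;
        for (TokenFilter filter : filters) {
            filter.init(current);
            current = filter;
        }
        return current;
    }
}

// Hypothetical tokenizer that splits on whitespace.
class SimpleWhitespaceTokenizer implements Tokenizer {
    private Reader input;

    public void init(Reader input) { this.input = input; }

    public String next() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) break;   // end of the current token
                } else {
                    sb.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// Hypothetical filter that lower-cases each token.
class SimpleLowerCaseFilter implements TokenFilter {
    private TokenStream input;

    public void init(TokenStream input) { this.input = input; }

    public String next() {
        String token = input.next();
        return token == null ? null : token.toLowerCase(Locale.ROOT);
    }
}

class BaseAnalyzerSketch {
    public static void main(String[] args) {
        BaseAnalyzer analyzer = new BaseAnalyzer();
        analyzer.setTokenizer(new SimpleWhitespaceTokenizer());
        analyzer.setFilters(Arrays.asList(new SimpleLowerCaseFilter()));
        TokenStream ts = analyzer.tokenStream("body", new StringReader("Hello  World"));
        for (String t = ts.next(); t != null; t = ts.next()) {
            System.out.println(t);
        }
    }
}
```

Because the same Tokenizer and TokenFilter instances are reused on every call to tokenStream(), a
stream returned by an earlier call is no longer valid once the next call re-initializes the chain.
That is the reason for the init() requirement mentioned above.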

I also created AbstractTokenizer and AbstractTokenFilter, which are trivial implementations of Tokenizer
and TokenFilter respectively.  I have made all existing tokenizers and filters backwards compatible.
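
As a rough idea of what such a trivial base class might look like, here is a sketch of an
AbstractTokenFilter that only stores the upstream stream in init() and leaves next() to subclasses.
The interface shapes, the protected input field, and the ReverseFilter subclass are assumptions for
illustration, not the contents of the patch.

```java
import java.util.Arrays;
import java.util.Iterator;

// Assumed shapes; the patch's real signatures are not shown in this message.
interface TokenStream {
    String next();   // next token text, or null when exhausted
}

interface TokenFilter extends TokenStream {
    void init(TokenStream input);
}

// Trivial base class: remembers the upstream stream so subclasses only write next().
abstract class AbstractTokenFilter implements TokenFilter {
    protected TokenStream input;

    public void init(TokenStream input) { this.input = input; }
}

// Hypothetical subclass that reverses the characters of each token.
class ReverseFilter extends AbstractTokenFilter {
    public String next() {
        String token = input.next();
        return token == null ? null : new StringBuilder(token).reverse().toString();
    }
}

class AbstractFilterSketch {
    public static void main(String[] args) {
        Iterator<String> words = Arrays.asList("abc", "de").iterator();
        TokenStream source = () -> words.hasNext() ? words.next() : null;
        ReverseFilter filter = new ReverseFilter();
        filter.init(source);
        for (String t = filter.next(); t != null; t = filter.next()) {
            System.out.println(t);
        }
    }
}
```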

Let me know whether you like or dislike this approach and what changes you would like me to make.  I ran
all the regression tests and they all passed.  I also wrote TestBaseAnalyzer to test the new Analyzer;
see that test for usage.  I haven't done a full-scale indexing test yet, but will soon.
