
Posted to solr-user@lucene.apache.org by revathy arun <re...@gmail.com> on 2009/02/18 11:33:55 UTC

solr 1.3 analyzers

Hi,

In Solr 1.3, under src/classes/java/analyzers,

I see only the following language-specific tokenizers:
chinesetokenizer
cjktokenizer
russiantokenizer

But I see filter factories for other languages, such as Dutch, French, and
Brazilian, with no matching tokenizer.
In this scenario, are we supposed to use the standard tokenizer and the
corresponding language filters? Lucene has analyzers for these languages. How do
we incorporate them into Solr?

Will this be available in future versions?

What is the difference between a normal filter factory and a stem filter
factory?

Regards

Re: solr 1.3 analyzers

Posted by AHMET ARSLAN <io...@yahoo.com>.

> I see filter factories for other languages, such as Dutch, French, and
> Brazilian, but no tokenizer. In this scenario, are we supposed to use the
> standard tokenizer and the corresponding language filters?

Yes. That is exactly what the Lucene Analyzers for those languages do.
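
As a minimal sketch of that combination in schema.xml, assuming Dutch as the language: the field type name text_dutch is made up for illustration, and the choice and order of the non-language filters are assumptions, not a recommended configuration.

```xml
<!-- Hypothetical field type; the name "text_dutch" is illustrative only -->
<fieldType name="text_dutch" class="solr.TextField">
  <analyzer>
    <!-- No Dutch-specific tokenizer is listed in the question, so the standard one is used -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <!-- Lowercasing here is an assumption, added so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The language-specific work (stemming) happens in the filter -->
    <filter class="solr.DutchStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The same pattern should apply to the other languages mentioned by swapping in the corresponding stem filter factory.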

> Lucene has the analyzers for the same. How do we incorporate the same into
> Solr? Will this be available in future versions?

You can also specify an existing Lucene Analyzer class that has a default constructor via the class attribute on the analyzer element:
<fieldType name="text_greek" class="solr.TextField">
   <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>

> What is the difference between a normal filter factory and a stem filter
> factory?

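A TokenFilter can delete tokens (StopFilter), inject tokens (SynonymFilter), or modify tokens (StemFilter), depending on its purpose. There is no such distinction as normal filter factory versus stem filter factory: a stem filter factory is simply a filter factory whose filter modifies tokens by stemming them.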
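
To illustrate, here is a rough sketch of one analyzer chain that uses all three kinds of filter. The field type name text_example is hypothetical, and stopwords.txt and synonyms.txt are assumed to exist in the Solr conf directory, as they do in the example configuration.

```xml
<!-- Hypothetical field type; the name "text_example" is illustrative only -->
<fieldType name="text_example" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Deletes tokens: drops any word listed in stopwords.txt -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- Injects tokens: adds the synonyms defined in synonyms.txt -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <!-- Modifies tokens: reduces each word to its stem -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
```

All three are declared with the same filter element; only the factory class and its attributes differ.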