Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2011/03/03 04:23:03 UTC

[Solr Wiki] Update of "UnicodeCollation" by RobertMuir

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "UnicodeCollation" page has been changed by RobertMuir.
The comment on this change is: add an example for ICU collation.
http://wiki.apache.org/solr/UnicodeCollation?action=diff&rev1=4&rev2=5

--------------------------------------------------

  = Unicode Collation =
- <!> [[Solr1.5]]
+ <!> [[Solr3.1]]
  
  == Overview ==
  [[http://en.wikipedia.org/wiki/Unicode_collation_algorithm|Unicode Collation]] is a method to sort text in a language-sensitive way. It is primarily intended for sorting, but can also be used for advanced search purposes.
@@ -144, +144 @@

  
  Please note that the strange output you see from the filter is really a binary collation key encoded in a special form. What is important is that it is the same value for equivalent tokens as defined by that collator.
  
+ == ICU Collation ==
+ 
+ For better performance, less memory usage, and support for more locales, you can add the analysis-extras contrib and use ICUCollationKeyFilterFactory instead. See the [[http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/analysis-extras/src/java/org/apache/solr/analysis/ICUCollationKeyFilterFactory.java|javadocs]] for more information.
+ 
+ In general, the principles are the same; you just specify an RFC3066 language identifier with the locale parameter instead of separate language, country, and variant parameters.
+ 
+ For example, to get German phonebook sort order:
+ 
+ {{{
+ <fieldType name="collatedICU" class="solr.TextField">
+   <analyzer>
+     <tokenizer class="solr.KeywordTokenizerFactory"/>
+     <filter class="solr.ICUCollationKeyFilterFactory"
+         locale="de@collation=phonebook"
+         strength="primary"
+     />
+   </analyzer>
+ </fieldType>
+ }}}
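+ 
+ A field type only affects sorting once a field uses it. The following is a minimal sketch, assuming a hypothetical existing field named "name" and a hypothetical sort field named "name_sort" (neither name comes from this page); adapt the names and attributes to your own schema:
+ 
+ {{{
+ <!-- Hypothetical field names: "name" is the field you display, "name_sort" holds the collation key -->
+ <field name="name_sort" type="collatedICU" indexed="true" stored="false"/>
+ <copyField source="name" dest="name_sort"/>
+ }}}
+ 
+ With a setup like this, a request would sort on the collated field (for example sort=name_sort asc) while still returning the original field for display.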
+ 
+ To use this filter, see solr/contrib/analysis-extras/README.txt for instructions on which jars you need to add to your SOLR_HOME/lib directory.
+