Posted to solr-dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2009/11/18 07:00:43 UTC
[jira] Updated: (SOLR-1571) unicode collation support
[ https://issues.apache.org/jira/browse/SOLR-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated SOLR-1571:
------------------------------
Attachment: SOLR-1571.patch
initial patch.
> unicode collation support
> -------------------------
>
> Key: SOLR-1571
> URL: https://issues.apache.org/jira/browse/SOLR-1571
> Project: Solr
> Issue Type: New Feature
> Components: Analysis
> Reporter: Robert Muir
> Priority: Minor
> Attachments: SOLR-1571.patch
>
>
> This patch adds support for Unicode collation (searching and sorting).
> Unicode collation is helpful in a search engine: for many languages you want things to match or sort differently.
> You might even want to use copyField and support different sort orders/matching schemes if you need to support multiple languages.
> This is simply a factory for Lucene's CollationKeyFilter, which indexes binary collation keys in a special format that preserves binary sort order.
> I've added support for creating a Collator in two ways:
> * system collator from a Locale spec (language + country + variant)
> * tailored collator from custom rules in a text file
> There is deliberately no option to use the "default" locale of the JVM (I consider this a bit dangerous).
> In this patch, it is mandatory to define the locale explicitly for a system collator.
> The required lucene-collation-2.9.1.jar is only 12KB.
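
The two ways of creating a Collator described above, and the idea of sorting by binary collation keys, can be sketched with the JDK's java.text classes alone. This is a minimal illustration under stated assumptions, not the code in the patch: the class name CollationSketch and its helper methods are hypothetical, the tailoring rules are passed as a string rather than read from a text file, and the index-safe encoding that CollationKeyFilter applies to the key bytes is not reproduced here.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical sketch class; none of these names come from the patch.
public class CollationSketch {

    // System collator from an explicit locale spec (language + country + variant).
    // It refuses to fall back to the JVM default locale.
    static Collator systemCollator(String language, String country, String variant) {
        if (language == null || language.isEmpty()) {
            throw new IllegalArgumentException("a language must be given explicitly");
        }
        String c = (country == null) ? "" : country;
        String v = (variant == null) ? "" : variant;
        return Collator.getInstance(new Locale(language, c, v));
    }

    // Tailored collator from custom rules (assumed here to be already loaded into a string).
    static Collator tailoredCollator(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid collation rules", e);
        }
    }

    // Binary collation key for a piece of text.
    static byte[] sortKey(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Compare two keys byte by byte, treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator german = systemCollator("de", "DE", "");
        String[] words = {"Zebra", "\u00c4rger", "Apfel", "apfel"};

        // Sort by comparing only the binary keys, never the original strings.
        Arrays.sort(words, (x, y) -> compareUnsigned(sortKey(german, x), sortKey(german, y)));
        System.out.println(Arrays.toString(words));

        // A tailored collator that reverses the order of a, b and c.
        Collator reversed = tailoredCollator("< c < b < a");
        System.out.println(reversed.compare("c", "a") < 0);
    }
}
```

Comparing the key bytes gives the same ordering as calling Collator.compare on the original strings, which is why an index can store only the keys and still sort or match according to the chosen locale.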