You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Steven Rowe (JIRA)" <ji...@apache.org> on 2008/05/05 06:40:55 UTC

[jira] Updated: (LUCENE-1279) RangeQuery and RangeFilter should use collation to check for range inclusion

     [ https://issues.apache.org/jira/browse/LUCENE-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated LUCENE-1279:
--------------------------------

    Attachment: LUCENE-1279.patch

Attaching a patch containing class CollatingRangeQuery, which extends RangeQuery, overriding the rewrite() method.  A test class is also supplied.  This is targetted at contrib/.

Because *all* index terms in the Field of the lower and upper terms of the range have to be examined, since index term ordering (Unicode code point order) is not necessarily the same as the collation in the given Locale, CollatingRangeQuery's will be significantly slower than the RangeQuery's.

One of the tests uses some of the Farsi information Esra supplied in the original post.  Note that neither Java 1.4.2 nor 1.5.0 contains collation information for Farsi.  Instead, the test uses the Arabic Locale, which appears to contain the proper letter ordering for the non-Arabic Farsi letters.

> RangeQuery and RangeFilter should use collation to check for range inclusion
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-1279
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1279
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Steven Rowe
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: LUCENE-1279.patch
>
>
> See [this java-user discussion|http://www.nabble.com/lucene-farsi-problem-td16977096.html] of problems caused by Unicode code-point comparison, instead of collation, in RangeQuery.
> RangeQuery could take in a Locale via a setter, which could be used with a java.text.Collator and/or CollationKey's, to handle ranges for languages which have alphabet orderings different from those in Unicode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org