Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2009/11/19 03:42:39 UTC

[jira] Issue Comment Edited: (LUCENE-2068) fix reverseStringFilter for unicode 4.0

    [ https://issues.apache.org/jira/browse/LUCENE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779803#action_12779803 ] 

Robert Muir edited comment on LUCENE-2068 at 11/19/09 2:42 AM:
---------------------------------------------------------------

This patch adds backwards compatibility for the old, buggy behavior, selected by version.
It is gross because many public static methods were exposed, and Solr, for example, is using them.

btw:
Simon, are you applying patches with Eclipse?
If so, it will not work: you need to open the patch in an editor, select all, copy, and then apply from Clipboard.
In your patch the test is corrupted; the characters should be Chinese. I think this is why you were confused about the tests before.

edit: sorry Simon, this was a mime-type/charset issue on my side, an x-diff versus x-patch thing :)


  
> fix reverseStringFilter for unicode 4.0
> ---------------------------------------
>
>                 Key: LUCENE-2068
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2068
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2068.patch, LUCENE-2068.patch, LUCENE_2068.patch, LUCENE_2068.patch
>
>
> ReverseStringFilter is not aware of supplementary characters: when it reverses a term it creates unpaired surrogates, which the indexer replaces with U+FFFD (but this does not happen at query time).
> The wrong words will conflate with each other and the right words won't match; basically the whole thing falls apart.
> This patch implements an in-place reverse using the algorithm from Apache Harmony's AbstractStringBuilder.reverse0().
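
For illustration, here is a minimal sketch of a surrogate-aware in-place reverse in Java. It is not the code from the attached patch and not Harmony's reverse0(); it is a simpler two-pass variant (reverse the UTF-16 code units, then swap each reversed surrogate pair back into high/low order). The class and method names are hypothetical.

```java
final class SurrogateAwareReverse {

    /**
     * Reverses buffer[start, start + len) in place while keeping each
     * surrogate pair in (high, low) order. Hypothetical helper, not a Lucene API.
     */
    public static void reverse(char[] buffer, int start, int len) {
        if (len < 2) {
            return;
        }
        int end = start + len - 1;

        // Pass 1: naive reversal of the UTF-16 code units.
        for (int i = start, j = end; i < j; i++, j--) {
            char tmp = buffer[i];
            buffer[i] = buffer[j];
            buffer[j] = tmp;
        }

        // Pass 2: every original (high, low) pair is now (low, high); swap it back.
        for (int i = start; i < end; i++) {
            if (Character.isLowSurrogate(buffer[i])
                    && Character.isHighSurrogate(buffer[i + 1])) {
                char tmp = buffer[i];
                buffer[i] = buffer[i + 1];
                buffer[i + 1] = tmp;
                i++; // skip the second half of the repaired pair
            }
        }
    }

    /** Convenience wrapper for whole strings. */
    public static String reverse(String s) {
        char[] chars = s.toCharArray();
        reverse(chars, 0, chars.length);
        return new String(chars);
    }
}
```

A naive char-by-char reverse of "a\uD840\uDC00b" (U+20000 between two ASCII letters) would leave the low surrogate before the high one; the sketch above keeps the pair intact, which is also what java.lang.StringBuilder.reverse() does.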
