Posted to dev@jackrabbit.apache.org by "Ard Schrijvers (JIRA)" <ji...@apache.org> on 2013/02/05 18:38:11 UTC

[jira] [Resolved] (JCR-3511) JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing

     [ https://issues.apache.org/jira/browse/JCR-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers resolved JCR-3511.
---------------------------------

    Resolution: Won't Fix

This turns out to be quite a bit more complex, as it already fails in the Lucene QueryParser as well. I think client code should just remove diacritics from the search term first if it wants to do free-text search against content indexed with an analyzer that removes diacritics.
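
A minimal sketch of what such client-side folding could look like, assuming plain Unicode decomposition is close enough to what the configured analyzer does. The class and method names (DiacriticFolding, fold, containsQuery) are hypothetical and not part of Jackrabbit or Lucene. Note that ISOLatin1AccentFilter also rewrites characters that do not decompose this way (for example 'ß' or 'æ'), so this is only an approximation of that filter.

```java
import java.text.Normalizer;

public class DiacriticFolding {

    // Decompose accented characters (NFD) and drop the combining marks,
    // so that "è" becomes "e". Wildcard characters are left untouched.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Hypothetical helper: wraps the folded term in a jcr:contains XPath
    // statement. Escaping of quotes in the term is omitted for brevity.
    static String containsQuery(String userTerm) {
        return "//*[jcr:contains(., '" + fold(userTerm) + "')]";
    }

    public static void main(String[] args) {
        // "tr\u00e8*" is the search term "trè*" written with a Unicode escape.
        System.out.println(containsQuery("tr\u00e8*"));
    }
}
```

The folded term then has the same shape as the terms the accent-removing analyzer wrote to the index, so the trailing wildcard can match them.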
                
> JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing 
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: JCR-3511
>                 URL: https://issues.apache.org/jira/browse/JCR-3511
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>            Reporter: Ard Schrijvers
>            Assignee: Ard Schrijvers
>             Fix For: 2.2.14, 2.4.4
>
>
> Wildcard prefixing or postfixing combined with stemming cannot always be made to work correctly in Lucene. However, postfixing a term with a wildcard should play nicely with the configured analyzers. Assume you have an analyzer that contains Lucene's ISOLatin1AccentFilter. In that case there is currently a problem: after indexing the word 'très' (mind the è accent), querying
> //*[jcr:contains(., 'trè*')] does not return a hit for très.
> //*[jcr:contains(., 'très')] DOES and
> //*[jcr:contains(., 'tr*')] DOES but
> //*[jcr:contains(., 'trè*')] DOES NOT
> The problem is simple to solve: JackrabbitQueryParser#getWildcardQuery gets the non-analyzed termStr as its argument, where afaics it should get the analyzed version. Then getLowercaseExpandedTerms() in #getWildcardQuery is also redundant.
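
The mismatch described above can be reproduced without Lucene. The following is only an illustrative simulation: analyze() is a stand-in for an accent-removing, lowercasing analyzer, and prefixHit() is a hypothetical helper that mimics a trailing-wildcard lookup. Neither is Jackrabbit or Lucene code.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;

public class WildcardAnalysisDemo {

    // Stand-in for an analyzer that lowercases and removes accents.
    static String analyze(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        String nfd = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return nfd.replaceAll("\\p{M}+", "");
    }

    // Prefix match against the indexed terms. analyzeFirst controls whether
    // the prefix is passed through the analyzer before matching.
    static boolean prefixHit(List<String> indexedTerms, String prefix, boolean analyzeFirst) {
        String p = analyzeFirst ? analyze(prefix) : prefix.toLowerCase(Locale.ROOT);
        return indexedTerms.stream().anyMatch(t -> t.startsWith(p));
    }

    public static void main(String[] args) {
        // Indexing "très" (written with a Unicode escape) stores "tres".
        List<String> index = List.of(analyze("tr\u00e8s"));

        System.out.println(prefixHit(index, "tr", false));       // true
        System.out.println(prefixHit(index, "tr\u00e8", false)); // false
        System.out.println(prefixHit(index, "tr\u00e8", true));  // true
    }
}
```

The second call corresponds to the failing 'trè*' query: the prefix is only lowercased, so it never matches the accent-free term in the index. The third call shows the behaviour the report asks for, where the prefix is analyzed first.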
