Posted to dev@jackrabbit.apache.org by "Ard Schrijvers (JIRA)" <ji...@apache.org> on 2013/02/05 12:37:13 UTC

[jira] [Created] (JCR-3511) JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing

Ard Schrijvers created JCR-3511:
-----------------------------------

             Summary: JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing 
                 Key: JCR-3511
                 URL: https://issues.apache.org/jira/browse/JCR-3511
             Project: Jackrabbit Content Repository
          Issue Type: Bug
            Reporter: Ard Schrijvers
            Assignee: Ard Schrijvers
             Fix For: 2.2.14, 2.4.4


Combining wildcard prefixing or postfixing with stemming cannot always be made to work correctly in Lucene. However, postfixing a term with a wildcard should play nicely with the configured analyzers. Assume you have an analyzer that contains Lucene's ISOLatin1AccentFilter. In that case there is currently a problem: for example, after indexing the word 'très' (mind the è accent), querying

//*[jcr:contains(., 'trè*')] does not have a hit for très.

//*[jcr:contains(., 'très')] DOES and
//*[jcr:contains(., 'tr*')] DOES but
//*[jcr:contains(., 'trè*')] DOES NOT

The problem is simple to solve: JackrabbitQueryParser#getWildcardQuery currently gets the non-analyzed termStr as its argument, where afaics it should get the analyzed version. With that change, the getLowercaseExpandedTerms() handling in #getWildcardQuery also becomes redundant, since the analyzer already takes care of lowercasing.
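
To illustrate the mismatch, here is a minimal self-contained sketch. It does not use the Lucene or Jackrabbit APIs: the analyze() method is a hypothetical stand-in for an analyzer chain that lowercases and folds accents (roughly what ISOLatin1AccentFilter does for 'è'), and prefixMatches() is a hypothetical stand-in for a wildcard prefix lookup against already-analyzed index terms. The 'è' is written as a Unicode escape so the source does not depend on file encoding.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class WildcardAnalysisSketch {

    // Hypothetical stand-in for the configured analyzer: strip accents, then lowercase.
    static String analyze(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for a trailing-wildcard query: true if any indexed term starts with the prefix.
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // At index time the word passes through the analyzer, so the index holds "tres".
        List<String> index = Collections.singletonList(analyze("tr\u00e8s"));

        String rawPrefix = "tr\u00e8"; // the part of the wildcard term before '*'

        // Only lowercasing the wildcard term leaves the accent in place: no hit.
        System.out.println(prefixMatches(index, rawPrefix.toLowerCase(Locale.ROOT))); // false

        // Running the wildcard term through the same analysis as the index: hit.
        System.out.println(prefixMatches(index, analyze(rawPrefix))); // true
    }
}
```

The point of the sketch is only that the query-side term has to go through the same normalization as the indexed term; how the analyzed term is obtained inside JackrabbitQueryParser is left to the actual fix.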


