You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Amin Mohammed-Coleman <am...@gmail.com> on 2011/01/20 22:37:15 UTC

Wildcard Case Sensitivity

Hi

Apologies up front if this question has been asked before.

I have a document which contains a field that stores an untokenized value such as TEST_TYPE.  The analyser used is StandardAnalyzer and I pass the same analyzer into the query.  I perform the following query : fieldName:TEST_*, however this does not return any results.  Is this the expected behaviour?  Can I use capital letters in my wildcard query or do i need to do some processing before passing it to the query parser? 


Any help would be appreciated.

Thanks
Amin
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Wildcard Case Sensitivity

Posted by Jack Krupansky <ja...@lucidimagination.com>.
Wildcards only work for a single term. At index time the underscore in 
TEST_TYPE is treated as if it were a space separator, producing two terms. 
At query time the existence of the wildcard suppresses ALL analysis of the 
term (although that behavior may vary between query parsers), so that the 
underscore is kept at query time even though StandardAnalyzer (or 
WordDelimiterFilter or similar filter) would produce a text stream without 
any underscores.

StandardAnalyzer lower cases its output, so there is no case sensitivity. 
But, if you happen to use upper case in a wildcard query term that query 
term will fail to match anything because the existence of the wildcard 
suppresses all analysis, including the lowercasing. Once again, different 
query parsers may behave differently.

-- Jack Krupansky

-----Original Message----- 
From: Amin Mohammed-Coleman
Sent: Thursday, January 20, 2011 4:37 PM
To: java-user@lucene.apache.org
Subject: Wildcard Case Sensitivity

Hi

Apologies up front if this question has been asked before.

I have a document which contains a field that stores an untokenized value 
such as TEST_TYPE.  The analyser used is StandardAnalyzer and I pass the 
same analyzer into the query.  I perform the following query : 
fieldName:TEST_*, however this does not return any results.  Is this the 
expected behaviour?  Can I use capital letters in my wildcard query or do i 
need to do some processing before passing it to the query parser?


Any help would be appreciated.

Thanks
Amin
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org