Posted to solr-user@lucene.apache.org by Manupriya <ma...@gmail.com> on 2009/01/20 07:42:50 UTC

Searching for 'A*' is not returning me same result as 'a*'

Hi,

I am using the following analyzer configuration for indexing and querying:
------------------------------------------------------------------------------------------------------
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
-----------------------------------------------------------------------------------------------------

I search using the Solr admin console. When I search for institutionName:a*,
I get 93 matching records. But when I search for institutionName:A*, I do
not get any matching records.

I ran Field Analysis on a* and A* with this analyzer configuration.

For a*
------
  http://www.nabble.com/file/p21557926/a-analysis.gif 


For A*
------
  http://www.nabble.com/file/p21557926/A1-analysis.gif 

As far as I can tell, the analyzer works correctly in both cases, so I do
not understand why the query returns no results for A*.

I feel that I am missing something. Can anyone help me with this?

Regards,
Manu
-- 
View this message in context: http://www.nabble.com/Searching-for-%27A*%27-is-not-returning-me-same-result-as-%27a*%27-tp21557926p21557926.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Searching for 'A*' is not returning me same result as 'a*'

Posted by Manupriya <ma...@gmail.com>.
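I found the answer to my problem. It happens because I am using a
wildcard: wildcard queries are not passed through the analyzer. The query
term A* is therefore not lowercased, while the indexed terms were
lowercased by LowerCaseFilterFactory, so no indexed term starts with an
uppercase 'A'. The Field Analysis page does run the analyzer on its input,
which is why both cases looked fine there.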

http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
http://markmail.org/message/25wm4mrdhs6yqnck#query:upper%20case%20solr+page:1+mid:7c6bf6e7p755eu67+state:results
http://www.mail-archive.com/solr-user@lucene.apache.org/msg08542.html
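
To make the mismatch concrete, here is a minimal Python sketch. It is a toy
model, not Solr or Lucene code: the institution names and the helper
functions index_terms and prefix_query are made up for illustration, and
stemming, stop words and the other filters are ignored. It assumes only
that index-time analysis lowercases terms and that a wildcard term is used
exactly as typed.

```python
# Toy model of the behaviour, not real Solr/Lucene code.
# Assumption: the index analyzer lowercases every term (as
# LowerCaseFilterFactory does), while a wildcard query term is used as typed.

def index_terms(values):
    """Simulate index-time analysis: split on whitespace and lowercase."""
    terms = set()
    for value in values:
        for token in value.split():
            terms.add(token.lower())
    return terms

def prefix_query(terms, query):
    """Simulate an unanalyzed prefix query such as 'a*'."""
    prefix = query.rstrip("*")
    return sorted(t for t in terms if t.startswith(prefix))

# Hypothetical institution names, made up for this example.
terms = index_terms(["Apex Academy", "Boston College", "Alpha Institute"])

lower_hits = prefix_query(terms, "a*")          # matches the lowercased terms
upper_hits = prefix_query(terms, "A*")          # no indexed term starts with 'A'
fixed_hits = prefix_query(terms, "A*".lower())  # lowercase before querying

print(lower_hits)
print(upper_hits)
print(fixed_hits)
```

The last call shows one possible workaround under these assumptions:
lowercase the wildcard term on the client side before sending the query.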

Thanks,
Manu



