Posted to dev@lucene.apache.org by "Dalius (Closed) (JIRA)" <ji...@apache.org> on 2012/02/08 10:40:59 UTC
[jira] [Closed] (SOLR-3106) Wildcard ? issue

     [ https://issues.apache.org/jira/browse/SOLR-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dalius closed SOLR-3106.
------------------------


Thanks for the references.
                
> Wildcard ? issue
> ----------------
>
>                 Key: SOLR-3106
>                 URL: https://issues.apache.org/jira/browse/SOLR-3106
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.5
>         Environment: Tomcat 7.0.25 (request encoding UTF-8)
> Solr 3.5.0
> Java 7 Oracle
> Ubuntu 11.10
>            Reporter: Dalius
>
> Sorry for the inaccurate title.
> I have three fields (dc_title, dc_title_unicode, dc_unicode_full) containing the same value:
> {code}
> <title xmlns="http://www.tei-c.org/ns/1.0">cal•lígraf</title>
> {code}
> and these fields are configured accordingly:
> {code}
>     <fieldType name="xml" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <charFilter class="solr.HTMLStripCharFilterFactory"/>
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.ICUFoldingFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.ICUFoldingFilterFactory"/>
>       </analyzer>
>     </fieldType>
>     
>     <fieldType name="xml_unicode" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <charFilter class="solr.HTMLStripCharFilterFactory"/>
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       </analyzer>
>     </fieldType>
>     
>     <fieldType name="xml_unicode_full" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <charFilter class="solr.HTMLStripCharFilterFactory"/>
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>       </analyzer>
>     </fieldType>
> {code}
> And finally my search configuration:
> {code}
>     <requestHandler name="dictionary" class="solr.SearchHandler">
>          <lst name="defaults">
>            <str name="echoParams">all</str>
>            <str name="defType">edismax</str>
>            <str name="mm">2&lt;-25%</str>
>            <str name="qf">dc_title_unicode_full^2 dc_title_unicode^2 dc_title</str>
>            <int name="rows">10</int>
>            <str name="spellcheck.onlyMorePopular">true</str>
>            <str name="spellcheck.extendedResults">false</str>
>            <str name="spellcheck.count">1</str>
>          </lst>
>         <arr name="last-components">
>           <str>spellcheck</str>
>         </arr>
>     </requestHandler>
> {code}
> I am trying to match the field with various (valid) search phrases. Here are the results:
> || \# || search phrase || match? || Comment ||
> | 1 | cal•lígra? | (/) | |
> | 2 | cal•ligra? | (x) | Changed í to i |
> | 3 | cal•ligraf | (/) | |
> | 4 | calligra? | (/) | |
> The problem is attempt #2, which fails to match the data. Attempt #3 works when ? is replaced with f.
> One more thing: if * is used instead of ?, other data such as cal•lígrafia is matched, but not cal•lígraf...
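
Rows 1 and 2 of the table differ only in í versus i. The sketch below is one possible reading of that difference, not a confirmed diagnosis. It assumes (the report does not say this) that a wildcard term is compared against the indexed terms exactly as typed, without passing through the query analyzer. The `fold` and `wildcard_hits` helpers are hypothetical stand-ins written for illustration; they are not Solr or Lucene APIs, and the sketch does not model how StandardTokenizer treats • or why rows 3 and 4 match.

```python
import fnmatch
import unicodedata


def fold(term: str) -> str:
    """Rough stand-in for accent and case folding (hypothetical helper).

    Decomposes the term (NFKD), drops combining marks, and lowercases.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def wildcard_hits(pattern: str, index_terms: list[str]) -> list[str]:
    """Match a ?/* pattern against indexed terms as typed (assumed: no analysis)."""
    return [term for term in index_terms if fnmatch.fnmatchcase(term, pattern)]


# A whitespace-tokenized field with no folding would store the term unchanged.
unfolded_terms = ["cal•lígraf"]

# Row 1: the accent is typed exactly as stored, so the wildcard can match.
row1 = wildcard_hits("cal•lígra?", unfolded_terms)

# Row 2: plain i is a different character from í, so nothing matches here.
row2 = wildcard_hits("cal•ligra?", unfolded_terms)

# Folding changes the stored form, so an accented wildcard typed by the user
# would no longer line up with a folded term either.
folded_form = fold("cal•lígraf")

print(row1, row2, folded_form)
```

Under that assumption, a wildcard query only matches a field whose stored terms use the same accents as the characters typed, which would be consistent with rows 1 and 2.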

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org