Posted to solr-user@lucene.apache.org by Pla Gong <pg...@marketlive.com> on 2011/04/22 09:58:26 UTC

Multi-word Solr Synonym issue

I am trying to do a simple mapping of a two-word term to a one-word term,
and it does not work. See my configuration at the bottom of this email. My
scenario is that I have a term, "pond care", that I want to map
to the term "fountain". Whenever a user enters the term "pond care"
in the search box, I want Solr to search on the word "fountain".
Searching on "fountain" or "pond food" should return the same number
of products. I have tried all kinds of filter combinations and cannot get
it to work. On the Solr analysis page, "pond food" does map to "fountain",
but when I test the search in Solr Admin, the query does not search on
"fountain" but only on "pond care". Here is the debug output from my Solr
Admin search:

- <lst name="debug">
- <lst name="queryBoosting">
  <str name="q">fountain</str> 
  <null name="match" /> 
  </lst>
  <str name="rawquerystring">pond food</str> 
  <str name="querystring">pond food</str> 
  <str name="parsedquery">+text:pond +text:food</str> 
  <str name="parsedquery_toString">+text:pond +text:food</str> 
- <lst name="explain">
  <str name="catalogItem.155.120.1">1.4865229 = (MATCH) sum of:
0.5013988 = (MATCH) weight(text:pond in 1137), product of: 0.5317582 =
queryWeight(text:pond), product of: 3.180518 = idf(docFreq=730,
maxDocs=6470) 0.16719232 = queryNorm 

I am new to Solr and I have Googled the issue, but I did not find a
solution that works for my case. Please let me know if you have
encountered this issue, how you resolved it, and what configuration you
used. I want the query to return results for the mapped term, not for a
combination of the two individual terms.

I would greatly appreciate any help.

Thanks,
Pla

-----------field type and filter Configuration

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory"
                language="English" protected="protwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory"
                language="English" protected="protwords.txt"/>
      </analyzer>
    </fieldType>

Re: Multi-word Solr Synonym issue

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Multi-word Solr Synonym issue
: In-Reply-To: <BA...@mail.gmail.com>

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.



-Hoss

Re: Multi-word Solr Synonym issue

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Maybe you are doing query-time synonym expansion?
Try changing that to do index-time synonym expansion.

See 
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
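
For background: the default query parser splits the input on whitespace
before the query analyzer runs, so a query-time synonym filter only ever
sees "pond" and "food" as separate tokens. That is consistent with the
parsedquery "+text:pond +text:food" in the debug output above. A rough
sketch of what index-time expansion could look like with the posted field
type follows. The synonyms.txt entry in the comment is an assumption (the
file contents were not posted), and switching the query tokenizer to
WhitespaceTokenizerFactory is also my assumption, not something from the
original configuration.

```xml
<!-- Sketch only. Assumed synonyms.txt entry (not shown in the original post),
     listing the terms as equivalents:
       pond food, fountain
-->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Index-time expansion: with expand="true", a document containing any
         term from the entry is indexed under all of them. -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <!-- Assumption: no synonym filter at query time, and the same tokenizer
         as the index analyzer so both sides produce comparable tokens. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

With a setup along these lines, a document that mentions "fountain" should
also be indexed with the tokens for "pond" and "food", so the parsed query
"+text:pond +text:food" can match it. The trade-offs are a larger index and
the need to reindex after this change and whenever synonyms.txt is edited.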


Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


