Posted to solr-user@lucene.apache.org by surya <ks...@gmail.com> on 2018/08/14 11:35:43 UTC

solr 3.4 do not want to apply synonym mapping term to search matching docs

The following data is being indexed into our Solr instance.

doc1:
<name>University of Virginia </name>
doc2:
<name>Katrina Uva </name>
doc3:
<name>University of new york </name>

synonyms.txt
   University of Virginia, uva

search term:
   University of Virginia

Expected result:
   doc1

Actual result:
   doc1 and doc2 

The second document is returned because the synonym term "uva" matches
doc2: Katrina Uva.

Requirement:
We do not want the synonym (uva) to bring back doc2 (Katrina Uva).
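
The effect described above can be reproduced with a toy model. This is only a sketch: it treats the synonym group as a two-way equivalence and matches whole words, and it does not reproduce the real analysis chain (no edge n-grams, no KeywordTokenizer, no PatternReplaceFilter). The names `SYNONYM_GROUPS`, `DOCS`, `expand`, and `search` are made up for illustration.

```python
# Toy model only: this does NOT reproduce Solr's analysis chain.

# Synonym group from the post: with expand="true", every term in the
# group is treated as equivalent to every other term in the group.
SYNONYM_GROUPS = [["university of virginia", "uva"]]

DOCS = {
    "doc1": "University of Virginia",
    "doc2": "Katrina Uva",
    "doc3": "University of new york",
}


def normalize(text):
    """Lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def expand(text):
    """Return the normalized text plus all synonyms of any group term it contains."""
    text = normalize(text)
    padded = f" {text} "
    variants = {text}
    for group in SYNONYM_GROUPS:
        if any(f" {term} " in padded for term in group):
            variants.update(group)
    return variants


def search(query):
    """Return ids of docs containing any expanded form of the query as whole words."""
    wanted = expand(query)
    hits = []
    for doc_id, name in DOCS.items():
        padded = f" {normalize(name)} "
        if any(f" {term} " in padded for term in wanted):
            hits.append(doc_id)
    return hits


print(search("University of Virginia"))  # ['doc1', 'doc2']
```

In this model the query expands to both "university of virginia" and "uva", and the second form is what pulls in doc2.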

The following are the analyzers from our Solr 3.4 setup:

<fieldType name="typeahead" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{20})(.*)?" replacement="$1" replace="all"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: solr 3.4 do not want to apply synonym mapping term to search matching docs

Posted by ksurya <su...@gmail.com>.
If we are willing to write code to change this behavior to suit our
requirement, where and how should we proceed? Any pointers that would help
us achieve this would be appreciated.

Thanks.




Re: solr 3.4 do not want to apply synonym mapping term to search matching docs

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/14/2018 5:35 AM, surya wrote:
> The following data is getting indexed-in to our solr.
>
> doc1:
> <name>University of Virginia </name>
> doc2:
> <name>Katrina Uva </name>
> doc3:
> <name>University of new york </name>
>
> synonym.txt
>    University of Virginia, uva
>
> search term:
>    University of Virginia
>
> Expected result:
>    doc1
>
> Actual result:
>    doc1 and doc2 
>
> the second document is coming because the synonym term "uva" is matching
> with doc2: Katrina Uva
> Requirement:
> We do not want to apply the synonym (uva) to bring the  doc2 (Katie Uva)

Unless you want to write special code to change how Solr works, you
cannot pick and choose to apply synonyms to some documents but not
others.  The synonyms are always going to apply.  The synonyms you have
chosen will cause this match to happen.

You could instead use this format in your synonyms file for a one-way
translation, but then you would not be able to do the search for the
full text and match documents where "UVA" is actually used to mean the
university:

university of virginia => uva
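
To make the difference between the two rule formats concrete, here is a small sketch of how a synonyms-file line can be read. It is not Solr's parser; `parse_rule` is a hypothetical helper that only mirrors the documented meaning of comma-separated lists (with `expand` true or false) and of explicit `=>` mappings.

```python
def parse_rule(line, expand=True):
    """Turn one synonyms-file line into a {input term: [output terms]} mapping."""

    def terms(part):
        # Split a comma-separated list, trimming whitespace and lowercasing.
        return [t.strip().lower() for t in part.split(",") if t.strip()]

    if "=>" in line:
        # Explicit mapping: each left-hand term is replaced by the right-hand terms.
        left, right = line.split("=>", 1)
        return {term: terms(right) for term in terms(left)}

    group = terms(line)
    if expand:
        # Equivalent synonyms with expand=true: every term maps to the whole group.
        return {term: list(group) for term in group}
    # expand=false: every term is reduced to the first term in the list.
    return {term: [group[0]] for term in group}


two_way = parse_rule("University of Virginia, uva")
one_way = parse_rule("university of virginia => uva")

print(two_way)
print(one_way)
```

In the comma-separated form, "uva" has its own entry and maps back to the full name. In the one-way form only the full name has an entry, so a standalone "uva" token is left untouched.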

Multi-word synonyms don't work properly unless the 'sow' parameter is
set to false.  This is the default setting since 7.0.
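
As a sketch of what passing that parameter looks like on a Solr version that supports it, the snippet below only builds a request URL and does not send it. The host, port, core name, and the use of `name` as the default field are assumptions made for illustration, not details from this thread.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/mycore/select"

params = {
    "q": "University of Virginia",
    "df": "name",    # default field; assumed from the <name> element in the post
    "sow": "false",  # do not split the query on whitespace before analysis
}

# Build the URL only; no request is made.
request_url = f"{BASE_URL}?{urlencode(params)}"
print(request_url)
```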

Thanks,
Shawn