Posted to solr-user@lucene.apache.org by Michael Bulla <Mi...@iteratec.de> on 2013/11/15 08:59:13 UTC

Is there a max size for a synonym definition?

Hi there,

Yesterday I ran into a strange problem using synonyms in Solr 4.3.0.

My schema uses the default configuration for synonyms:

      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>

Everything works fine with that config, except for this synonym line:

Combustion, combustión, gases, gas, humo, humos, analizador, analizadores, emisiones, contaminante, O2, oxígeno, oxigeno, carbono, NOx, NO, NO2, SO2, CO, H2S, HC, hidrocarburos, inquemados, quemador, caldera, horno, chimenea

When I search for any of those terms, I don't get any results. Removing the special characters didn't help, and neither did removing the numbers. When I shortened the list to ~90 characters (10 terms), I got results again. Is there some kind of length constraint when using synonyms?
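
To narrow this down further, the line could be split into smaller candidate lines and each one tried in turn. The following is only a minimal Python sketch of that idea: the helper names (parse_terms, chunk_terms) and the group size of 10 are illustrative assumptions and have nothing to do with Solr itself. Note that splitting a line also splits the synonym group, so the output is only meant for testing.

```python
# Hypothetical helper script, not part of Solr: summarize one synonyms.txt
# line and split it into smaller candidate lines for trial-and-error testing.

LINE = ("Combustion, combustión, gases, gas, humo, humos, analizador, "
        "analizadores, emisiones, contaminante, O2, oxígeno, oxigeno, carbono, "
        "NOx, NO, NO2, SO2, CO, H2S, HC, hidrocarburos, inquemados, quemador, "
        "caldera, horno, chimenea")


def parse_terms(line):
    """Split a comma-separated synonym line into trimmed, non-empty terms."""
    return [t.strip() for t in line.split(",") if t.strip()]


def chunk_terms(terms, size):
    """Return consecutive groups of at most `size` terms, rejoined as lines."""
    return [", ".join(terms[i:i + size]) for i in range(0, len(terms), size)]


terms = parse_terms(LINE)
print(len(terms), "terms,", len(LINE), "characters")

# Each printed line is a shorter candidate to paste into synonyms.txt.
for candidate in chunk_terms(terms, 10):
    print(candidate)
```

If one of the shorter lines still returns no results, the same split can be repeated on that line with a smaller group size.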

Regards,
Michael