Posted to solr-user@lucene.apache.org by satya swaroop <ss...@gmail.com> on 2010/07/29 10:00:55 UTC

spell checking problem

Hi all,
      I need some help with spellchecking. I configured my solrconfig.xml and
schema.xml by following the user mailing list, and my configuration is given
below.

my schema.xml::::::
------------------------
 <fieldType name="spellText" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

 <field name="spell" type="spellText" indexed="true" stored="true" multiValued="true"/>

<copyField source="*" dest="spell"/>



my solrconfig.xml:::::::::
--------------------------
  <requestHandler name="spellchecker" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.onlyMorePopular">false</str>
      <str name="spellcheck.extendedResults">false</str>
      <str name="spellcheck.count">5</str>

    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>



 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spellText</str>

    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">name</str>
      <!-- "name" is the default field from solrconfig.xml; if I change it to
           the "spell" field, the dictionary is not created -->
      <str name="spellcheckIndexDir">./spell</str>
      <str name="buildOnCommit">true</str>
    </lst>

    <!-- a spellchecker that uses a different distance measure-->
    <lst name="spellchecker">
      <str name="name">jarowinkler</str>
      <str name="field">spell</str>
      <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
      <str name="spellcheckIndexDir">./spellcheckerjaro</str>
    </lst>


  </searchComponent>




1) For the default dictionary the index is created, but if I query "jawa"
the suggestions it gives are "data" and "sata", while the suggestion I expect
is "java". I have nearly 20 Java docs indexed.
2) Another problem: if I build the jarowinkler dictionary, which uses the
"spell" field, the dictionary is not created and I only see segments.gen and
segments_1 in its directory.
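
For reference, the kind of requests that exercise 1) and 2) can be sketched as
below. This is only a sketch: the host, port and /solr/select path are
assumptions (the stock Solr example setup), and the helper name spellcheck_url
is hypothetical. The handler name "spellchecker" and the dictionary names come
from the solrconfig.xml above; spellcheck, spellcheck.dictionary and
spellcheck.build are the standard SpellCheckComponent request parameters.

```python
from urllib.parse import urlencode

# Assumption: default example Solr address; adjust to the real host and path.
SOLR_SELECT = "http://localhost:8983/solr/select"


def spellcheck_url(query, dictionary, build=False):
    """Build a URL for the "spellchecker" request handler (hypothetical helper)."""
    params = [
        ("qt", "spellchecker"),                 # handler name from solrconfig.xml
        ("q", query),
        ("spellcheck", "true"),                 # turn the component on
        ("spellcheck.dictionary", dictionary),  # "default" or "jarowinkler"
    ]
    if build:
        # Ask the component to (re)build the selected dictionary on this request.
        params.append(("spellcheck.build", "true"))
    return SOLR_SELECT + "?" + urlencode(params)


# Problem 1: suggestions for "jawa" from the default dictionary.
print(spellcheck_url("jawa", "default"))
# Problem 2: explicit build of the jarowinkler dictionary.
print(spellcheck_url("jawa", "jarowinkler", build=True))
```

The jarowinkler spellchecker above has no buildOnCommit setting, so an explicit
spellcheck.build=true request like the second one is what triggers its build.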


regards,
satya