Posted to solr-user@lucene.apache.org by Zheng Lin Edwin Yeo <ed...@gmail.com> on 2019/01/14 02:24:16 UTC

WordBreakSolrSpellChecker returning weird results

Hi,

I am trying out the WordBreakSolrSpellChecker, and I found that the results
it returns are quite weird.

For example, for this query
http://localhost:8983/solr/collection1/spell?q=1006&fl=null, which has
an exact match in my index, I am getting the following results returned.

  "spellcheck":{
    "suggestions":[
      "1006",{
        "numFound":7,
        "startOffset":1,
        "endOffset":5,
        "origFreq":0,
        "suggestion":[{
            "word":"1 006",
            "freq":53043},
          {
            "word":"100 6",
            "freq":31220},
          {
            "word":"10 06",
            "freq":3828},
          {
            "word":"1 0 06",
            "freq":58996},
          {
            "word":"10 0 6",
            "freq":58996},
          {
            "word":"1 00 6",
            "freq":53043},
          {
            "word":"1 0 0 6",
            "freq":58996}]}],
    "correctlySpelled":false,
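
In case it helps anyone reproduce this, here is a minimal Python sketch for
sending the same request and listing the suggestions. The helper names
(build_url, extract_suggestions, fetch) are made up for illustration, the
host and collection are the ones from my query above, and the parsing
assumes the flat [term, details, ...] layout shown in the response above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL, taken from the query above; adjust host and collection.
SPELL_URL = "http://localhost:8983/solr/collection1/spell"


def build_url(query, base=SPELL_URL):
    """Return the /spell request URL for a query term (wt=json set explicitly)."""
    return base + "?" + urlencode({"q": query, "fl": "null", "wt": "json"})


def extract_suggestions(response):
    """Return [(original_term, [(word, freq), ...]), ...] from a parsed response.

    Assumes "suggestions" is a flat list alternating between the original
    term and its details object, as in the output shown above.
    """
    flat = response.get("spellcheck", {}).get("suggestions", [])
    result = []
    for term, details in zip(flat[0::2], flat[1::2]):
        words = [(s["word"], s["freq"]) for s in details.get("suggestion", [])]
        result.append((term, words))
    return result


def fetch(query):
    """Send the request to a running Solr instance and parse the JSON body."""
    with urlopen(build_url(query)) as resp:
        return json.load(resp)
```

With Solr running, extract_suggestions(fetch("1006")) should give the list of
broken-up words and their frequencies for the term.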


Below is my configuration in solrconfig.xml.
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">content</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

   <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <!-- Solr will use suggestions from both the 'default' spellchecker
           and from the 'wordbreak' spellchecker and combine them.
           collations (re-written queries) can include a combination of
           corrections from both spellcheckers -->
      <!--<str name="spellcheck.dictionary">default</str>-->
      <str name="spellcheck.dictionary">wordbreak</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.maxResultsForSuggest">5</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.collateExtendedResults">true</str>
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>
      <str name="wt">json</str>
      <str name="indent">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>

Any ideas on how we can prevent this?
I am using Solr 7.5.0.


Regards,
Edwin

