Posted to solr-user@lucene.apache.org by agorriz <ag...@tacitknowledge.com> on 2018/08/08 13:37:37 UTC

[Solr 7.1.0] spellcheck.maxCollationTries > 0 no results

I have a problem with Solr's suggested terms. When I search for a misspelled
phrase or word, for example "halogan balbs" (0 results found), I want a
suggestion that leads to results (e.g. "halogen bulbs").

I'm able to get a suggested phrase by enabling spellcheck.collate with
spellcheck.maxCollationTries=0, but unfortunately the suggested phrase
does not always generate results (e.g. searching for "fence panel" (1 result)
suggests "face paper" (0 results)).

According to the documentation, to avoid collations that return 0 results
I can set spellcheck.maxCollationTries > 0, but when I do so
the returned collation is always empty, even when
the individual suggested words, collated, would generate results.

My question is: why is that happening, and how can I avoid it?

Here is an example query for "halogan balbs" that does not work as I
expect:

http://localhost:8983/solr/master_Product_default/select?fq=(catalogId:%22ProductCatalog%22%20AND%20catalogVersion:%22Online%22)&q=((code_string:halogan^100.0))%20OR%20((code_string:balbs^100.0))%20OR%20((code_string:%22halogan%20balbs%22~10.0^100.0)%20OR%20(brand.search_text_mv:%22halogan%20balbs%22~10.0^300.0)%20OR%20(categoryName_text_en_mv:%22halogan%20balbs%22~10.0^700.0)%20OR%20(type.search_text_mv:%22halogan%20balbs%22~10.0^800.0)%20OR%20(name_text_en:%22halogan%20balbs%22~10.0^500.0))&rows=20&spellcheck.dictionary=default&spellcheck.q=halogan%20balbs&spellcheck=true&spellcheck.collate=true&spellcheck.extendedResults=true&spellcheck.collateExtendedResults=true&spellcheck.count=100&spellcheck.maxCollationTries=500

that query returns the following:

"spellcheck":{
    "suggestions":[
      "halogan",{
        "numFound":1,
        "startOffset":0,
        "endOffset":7,
        "origFreq":0,
        "suggestion":[{
            "word":"halogen",
            "freq":84}]},
      "balb",{
        "numFound":1,
        "startOffset":8,
        "endOffset":13,
        "origFreq":0,
        "suggestion":[{
            "word":"bulb",
            "freq":198}]}],
    "correctlySpelled":false,
    "collations":[]}}

Note that "halogen" and "bulb" are returned as individual suggestions but
collations is empty, whereas if I run the query with
"spellcheck.maxCollationTries=0" then I get "halogen bulb" as the suggested
collation query:

  "spellcheck":{
    "suggestions":[
      "halogan",{
        "numFound":1,
        "startOffset":0,
        "endOffset":7,
        "origFreq":0,
        "suggestion":[{
            "word":"halogen",
            "freq":84}]},
      "balb",{
        "numFound":1,
        "startOffset":8,
        "endOffset":13,
        "origFreq":0,
        "suggestion":[{
            "word":"bulb",
            "freq":198}]}],
    "correctlySpelled":false,
    "collations":[
      "collation",{
        "collationQuery":"halogen bulb",
        "hits":0,
        "misspellingsAndCorrections":[
          "halogan","halogen",
          "balb","bulb"]}]}} 

I would expect the empty collations only if searching for "halogen bulb"
returned 0 results, but in this particular case the search returns results:

http://localhost:8983/solr/master_Product_default/select?fq=(catalogId:%22ProductCatalog%22%20AND%20catalogVersion:%22Online%22)&q=((code_string:halogen^100.0))%20OR%20((code_string:bulb^100.0))%20OR%20((code_string:%22halogen%20bulb%22~10.0^100.0)%20OR%20(brand.search_text_mv:%22halogen%20bulb%22~10.0^300.0)%20OR%20(categoryName_text_en_mv:%22halogen%20bulb%22~10.0^700.0)%20OR%20(type.search_text_mv:%22halogen%20bulb%22~10.0^800.0)%20OR%20(name_text_en:%22halogen%20bulb%22~10.0^500.0))&rows=20&spellcheck.dictionary=default&spellcheck.q=halogen%20bulb&spellcheck=true&spellcheck.collate=true&spellcheck.extendedResults=true&spellcheck.collateExtendedResults=true&spellcheck.count=100&spellcheck.maxCollationTries=500

returns:

 "response":{"numFound":42,"start":0,"docs":[
      {...}
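
For comparing the two responses above programmatically, here is a minimal
Python sketch that reads the "spellcheck" block. It assumes the flat
[key, value, key, value, ...] list layout shown in this thread, with
spellcheck.extendedResults=true and spellcheck.collateExtendedResults=true;
the function names are only illustrative, not part of Solr.

```python
import json

def pair_up(flat):
    """Turn a flat [key, value, key, value, ...] list into (key, value) pairs."""
    return list(zip(flat[0::2], flat[1::2]))

def summarize_spellcheck(spellcheck):
    # Per-term suggestions: original term -> list of suggested words.
    # Assumes extendedResults=true, so each suggestion is a {"word", "freq"} object.
    suggestions = {
        term: [s["word"] for s in info["suggestion"]]
        for term, info in pair_up(spellcheck.get("suggestions", []))
    }
    # Collations: (collationQuery, hits) pairs; empty when the collator returned nothing.
    # Assumes collateExtendedResults=true, so each collation is an object.
    collations = [
        (c["collationQuery"], c["hits"])
        for _, c in pair_up(spellcheck.get("collations", []))
    ]
    return suggestions, collations

# Sample trimmed from the maxCollationTries=0 response quoted in this thread.
sample = json.loads("""
{"suggestions": ["halogan", {"numFound": 1, "suggestion": [{"word": "halogen", "freq": 84}]},
                 "balb",    {"numFound": 1, "suggestion": [{"word": "bulb", "freq": 198}]}],
 "correctlySpelled": false,
 "collations": ["collation", {"collationQuery": "halogen bulb", "hits": 0,
                              "misspellingsAndCorrections": ["halogan", "halogen", "balb", "bulb"]}]}
""")

if __name__ == "__main__":
    print(summarize_spellcheck(sample))
```

An empty list in the second element of the result corresponds to the
"collations":[] case seen with spellcheck.maxCollationTries=500.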





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

RE: [External] [Solr 7.1.0] spellcheck.maxCollationTries > 0 no results

Posted by agorriz <ag...@tacitknowledge.com>.
Thanks James,

I found a solution for my problem: using
spellcheck.q=spellcheck_en:halogan%20balbs seems to work.

But I would expect it to work without having to name the spellcheck field in
the query (it is already configured in solrconfig.xml):

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">text_spell</str>
    <lst name="spellchecker">
        <str name="name">default</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
        <str name="field">spellcheck_en</str>
    </lst>
    <lst name="spellchecker">
        <str name="name">en</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
        <str name="field">spellcheck_en</str>
    </lst>
</searchComponent>
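
A rough Python sketch of how the field-qualified spellcheck.q could be built
on the client side. The helper name and the simplified main query in the
usage line are assumptions for illustration; the parameter names and the
spellcheck_en field come from the request and config in this thread.

```python
from urllib.parse import urlencode

def spellcheck_url(base, user_query, main_query, field=None):
    """Build a select URL; if `field` is given, prefix spellcheck.q with it."""
    # "field:terms" is the form reported to work in this thread.
    spell_q = f"{field}:{user_query}" if field else user_query
    params = {
        "q": main_query,
        "spellcheck": "true",
        "spellcheck.dictionary": "default",
        "spellcheck.q": spell_q,
        "spellcheck.collate": "true",
        "spellcheck.collateExtendedResults": "true",
        "spellcheck.maxCollationTries": "500",
    }
    # urlencode percent-encodes the values (spaces become "+").
    return f"{base}?{urlencode(params)}"

if __name__ == "__main__":
    print(spellcheck_url(
        "http://localhost:8983/solr/master_Product_default/select",
        "halogan balbs",
        'name_text_en:"halogan balbs"',  # simplified stand-in for the full q
        field="spellcheck_en",
    ))
```

Whether the field prefix should be needed at all is the open question here;
the sketch only reproduces the request shape that appeared to work.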




RE: [External] [Solr 7.1.0] spellcheck.maxCollationTries > 0 no results

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
It doesn't appear to me that the collator works with "spellcheck.q".  Looking at the unit test (SpellCheckCollatorTest.java), this is not a use-case that is being tested.  I opened https://issues.apache.org/jira/browse/SOLR-12650 to track this bug.

As a workaround, you can remove "spellcheck.q" and it might work.  You also probably want smaller values for spellcheck.count and spellcheck.maxCollationTries, maybe 10-20 for these.
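
A minimal Python sketch of that workaround applied to a parameter map,
assuming the request parameters are held in a plain dict of strings. The
function name and the default cap of 20 are illustrative choices, not Solr
settings.

```python
def apply_workaround(params, cap=20):
    """Return a copy of params without spellcheck.q and with capped try counts."""
    adjusted = dict(params)  # leave the caller's dict untouched
    # Without spellcheck.q, Solr spellchecks the terms of the main q parameter.
    adjusted.pop("spellcheck.q", None)
    for key in ("spellcheck.count", "spellcheck.maxCollationTries"):
        if key in adjusted:
            adjusted[key] = str(min(int(adjusted[key]), cap))
    return adjusted

if __name__ == "__main__":
    original = {
        "q": "name_text_en:halogan",  # simplified stand-in for the full q
        "spellcheck": "true",
        "spellcheck.q": "halogan balbs",
        "spellcheck.collate": "true",
        "spellcheck.count": "100",
        "spellcheck.maxCollationTries": "500",
    }
    print(apply_workaround(original))
```

Note that with spellcheck.q removed, the collation is built from the full q,
so a long multi-field query like the one in the original post may produce a
collation that is less convenient to show to a user.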

James Dyer
Ingram Content Group
