Posted to issues@solr.apache.org by "Cassandra Targett (Jira)" <ji...@apache.org> on 2021/09/17 15:22:00 UTC

[jira] [Commented] (SOLR-9060) Spellcheck sort by frequency in solrcloud

    [ https://issues.apache.org/jira/browse/SOLR-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416758#comment-17416758 ] 

Cassandra Targett commented on SOLR-9060:
-----------------------------------------

While it's interesting that the behavior changes depending on whether {{spellcheck.extendedResults}} is true or false, I don't think that changes the underlying question of whether we can sort by document frequency instead of distance score in SolrCloud mode.

Using the same example, I get the following whether or not {{comparatorClass}} is set to {{freq}} (the query term was "cort" against the techproducts example documents, as in the previous comment):

{code}
    "suggestions":[
      "cort",{
        "numFound":4,
        "startOffset":0,
        "endOffset":4,
        "origFreq":0,
        "suggestion":[{
            "word":"corp",
            "freq":2},
          {
            "word":"cord",
            "freq":1},
          {
            "word":"card",
            "freq":4},
          {
            "word":"cook",
            "freq":1}]}]
{code}

I see the order is slightly different when {{spellcheck.extendedResults}} is false:

{code}
    "suggestions":[
      "cort",{
        "numFound":4,
        "startOffset":0,
        "endOffset":4,
        "suggestion":["cord",
          "corp",
          "card",
          "cook"]}]
{code}

But I know from the earlier query that "card" is the term with the most results, so it still isn't sorting by overall frequency. I think the difference in order when using the {{spellcheck.extendedResults}} param may be a separate bug.

Either way, if I'd rather have "card" first because it has more hits, it appears that {{comparatorClass=freq}} just doesn't work in SolrCloud mode.
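
To make the expected ordering concrete, here is a minimal Python sketch that re-sorts the distributed response from the first example by its {{freq}} values. It only illustrates the order I'd expect from {{comparatorClass=freq}}; it is not how Solr merges shard suggestions, and the helper names ({{sort_by_freq}}, {{is_sorted_by_freq}}) are made up for this example.

```python
# Suggestions copied from the distributed (SolrCloud) response for "cort"
# with spellcheck.extendedResults=true.
cloud_suggestions = [
    {"word": "corp", "freq": 2},
    {"word": "cord", "freq": 1},
    {"word": "card", "freq": 4},
    {"word": "cook", "freq": 1},
]


def sort_by_freq(suggestions):
    """Hypothetical helper: order by descending freq.

    Python's sort is stable, so suggestions with equal freq keep the
    relative order they had in the response.
    """
    return sorted(suggestions, key=lambda s: s["freq"], reverse=True)


def is_sorted_by_freq(suggestions):
    """Hypothetical helper: True if freq never increases down the list."""
    freqs = [s["freq"] for s in suggestions]
    return all(a >= b for a, b in zip(freqs, freqs[1:]))


resorted = sort_by_freq(cloud_suggestions)

print(is_sorted_by_freq(cloud_suggestions))  # False: "card" (4) comes after "cord" (1)
print([s["word"] for s in resorted])         # ['card', 'corp', 'cord', 'cook']
```

A client could apply something like this as a stopgap, but only when {{spellcheck.extendedResults}} is true, since the plain response carries no {{freq}} values to sort on.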

Just to show that the bug is limited to distributed queries, here's the order of terms on a single node with the default {{comparatorClass=score}}:

{code}
    "suggestions":[
      "cort",{
        "numFound":3,
        "startOffset":0,
        "endOffset":4,
        "origFreq":0,
        "suggestion":[{
            "word":"corp",
            "freq":2},
          {
            "word":"cord",
            "freq":1},
          {
            "word":"card",
            "freq":4}]}]
{code}

Single node with {{comparatorClass=freq}}:

{code}
    "suggestions":[
      "cort",{
        "numFound":3,
        "startOffset":0,
        "endOffset":4,
        "origFreq":0,
        "suggestion":[{
            "word":"card",
            "freq":4},
          {
            "word":"corp",
            "freq":2},
          {
            "word":"cord",
            "freq":1}]}]
{code}

> Spellcheck sort by frequency in solrcloud
> -----------------------------------------
>
>                 Key: SOLR-9060
>                 URL: https://issues.apache.org/jira/browse/SOLR-9060
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.3
>            Reporter: Gitanjali Palwe
>            Priority: Major
>         Attachments: spellcheck-sort-frequency.png
>
>
> The sorting by frequency for spellchecker doesn't work in solrcloud mode.


