You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Amrit Sarkar (JIRA)" <ji...@apache.org> on 2018/03/23 21:34:00 UTC

[jira] [Comment Edited] (SOLR-10513) CLONE - ConjunctionSolrSpellChecker wrong check for same string distance

    [ https://issues.apache.org/jira/browse/SOLR-10513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412114#comment-16412114 ] 

Amrit Sarkar edited comment on SOLR-10513 at 3/23/18 9:33 PM:
--------------------------------------------------------------

[~varunthacker]: thank you for the feedback.

bq. The test doesn't need to add a third spell checker and also should probably implement toString in LuceneLevenshteinDistance ?

Like {{LevenshteinDistance}}. We want to prove here that A,A,B {{StringDistance}} type will not go with each other, where A and B can be arbitrary. If we are verifying LuceneLevenshteinDistance of one spellchecker should match with other, it shouldn't with non-LuceneLevenshteinDistance stringDistance.

bq. This should be an init failure and not during a query. Maybe ConjunctionSolrSpellChecker could have a constructor that takes a stringDistance and accuracy and avoids all these checks? Let's spin that off in other Jira and discuss there.

bq. Also while we're working on this patch can we validate the accuracy/queryAnalyzer checks works as expected and have a test for that?

For both these points we need to extend / improve / diversify the ConjunctionSolrSpellCheckerTest which has MockSpellCheckers now. We can discuss.

P.S. forgot, thank Abhishek, you analysis solved the problem anyway.


was (Author: sarkaramrit2@gmail.com):
[~varunthacker]: thank you for the feedback.

bq. The test doesn't need to add a third spell checker and also should probably implement toString in LuceneLevenshteinDistance ?

Like {{LevenshteinDistance}}. We want to prove here that A,A,B {{StringDistance}} type will not go with each other, where A and B can be arbitrary, that's why. If we are verifying LuceneLevenshteinDistance of one spellchecker should match with other, it shouldn't with non-LuceneLevenshteinDistance stringDistance.

bq. This should be an init failure and not during a query. Maybe ConjunctionSolrSpellChecker could have a constructor that takes a stringDistance and accuracy and avoids all these checks? Let's spin that off in other Jira and discuss there.

bq. Also while we're working on this patch can we validate the accuracy/queryAnalyzer checks works as expected and have a test for that?

For both these points we need to extend / improve / diversify the ConjunctionSolrSpellCheckerTest which has MockSpellCheckers now. We can discuss.

P.S. forgot, thank Abhishek, you analysis solved the problem anyway.

> CLONE - ConjunctionSolrSpellChecker wrong check for same string distance
> ------------------------------------------------------------------------
>
>                 Key: SOLR-10513
>                 URL: https://issues.apache.org/jira/browse/SOLR-10513
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 4.9
>            Reporter: Abhishek Kumar Singh
>            Assignee: James Dyer
>            Priority: Major
>             Fix For: 5.5
>
>         Attachments: SOLR-10513.patch, SOLR-10513.patch
>
>
> See ConjunctionSolrSpellChecker.java
> try {
>       if (stringDistance == null) {
>         stringDistance = checker.getStringDistance();
>       } else if (stringDistance != checker.getStringDistance()) {
>         throw new IllegalArgumentException(
>             "All checkers need to use the same StringDistance.");
>       }
>     } catch (UnsupportedOperationException uoe) {
>       // ignore
>     }
> In line stringDistance != checker.getStringDistance() there is comparing by references. So if you are using 2 or more spellcheckers with same distance algorithm, exception will be thrown anyway.
> *Update:* As of Solr 6.5, this has been changed to *stringDistance.equals(checker.getStringDistance())* .
> However, *LuceneLevenshteinDistance* does not even override equals method. 
> This does not solve the problem yet, because the *default equals* method anyway compares references.
> Hence unable to use *FileBasedSolrSpellChecker* .  
> Moreover, Some check of similar sorts should also be in the init method. So that user does not have to wait for this error during query time. If the spellcheck components have been added *solrconfig.xml* , it should throw error during core-reload itself.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org