Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2020/03/25 04:19:00 UTC

[jira] [Moved] (LUCENE-9289) Speed up Levenshtein distance calculation when we don't need the exact distance

     [ https://issues.apache.org/jira/browse/LUCENE-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley moved SOLR-14360 to LUCENE-9289:
---------------------------------------------

              Key: LUCENE-9289  (was: SOLR-14360)
    Lucene Fields: New
          Project: Lucene - Core  (was: Solr)
         Security:     (was: Public)

> Speed up Levenshtein distance calculation when we don't need the exact distance
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-9289
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9289
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Andras Salamon
>            Priority: Minor
>         Attachments: SOLR-14360-01.patch
>
>
> Sometimes when we calculate the Levenshtein distance we don't need the exact distance; we only want to know whether the strings are similar enough.
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/spelling/SolrSpellChecker.java#L113-L114]
> {noformat}
> sug.score = sd.getDistance(original, sug.string);        
> if (sug.score < min) continue; {noformat}
> If we pass this threshold into the distance calculation, we can speed it up: the calculation can stop as soon as we already know that the distance score will be lower than the threshold.
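
The early-termination idea described in the issue can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not Lucene's API: the class name BoundedLevenshtein, the method distanceWithin, and the convention of returning -1 are all hypothetical. It also assumes the score threshold has already been converted into a maximum allowed number of edits (maxDistance), which the original snippet does not show.

```java
// Hypothetical sketch: Levenshtein distance with an upper bound.
// Returns the exact edit distance if it is <= maxDistance, otherwise -1.
final class BoundedLevenshtein {

    static int distanceWithin(String a, String b, int maxDistance) {
        // The distance is at least the difference in lengths,
        // so we can reject without filling any table row.
        if (Math.abs(a.length() - b.length()) > maxDistance) {
            return -1;
        }

        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minima never decrease from one row to the next, so once
            // every cell exceeds the bound the final distance must too.
            if (rowMin > maxDistance) {
                return -1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        int distance = prev[b.length()];
        return distance <= maxDistance ? distance : -1;
    }
}
```

A caller in the position of the snippet above could then skip a suggestion whenever the result is -1, instead of computing the full distance and comparing afterwards. A further refinement, not shown here, would be to fill only the diagonal band of width 2 * maxDistance + 1 in each row.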



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
