Posted to issues@lucene.apache.org by "Lucene/Solr QA (Jira)" <ji...@apache.org> on 2020/03/25 12:30:00 UTC

[jira] [Commented] (LUCENE-9289) Speed up Levenshtein distance calculation when we don't need the exact distance

    [ https://issues.apache.org/jira/browse/LUCENE-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066641#comment-17066641 ] 

Lucene/Solr QA commented on LUCENE-9289:
----------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 14s{color} | {color:green} suggest in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 45m 12s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 20s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9289 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12997538/SOLR-14360-01.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / ad75916b6bd |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/259/testReport/ |
| modules | C: lucene/suggest solr/core U: . |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/259/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Speed up Levenshtein distance calculation when we don't need the exact distance
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-9289
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9289
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spellchecker
>            Reporter: Andras Salamon
>            Priority: Minor
>         Attachments: SOLR-14360-01.patch
>
>
> Sometimes when we calculate the Levenshtein distance we don't need the exact distance; we only want to know whether the strings are similar enough.
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/spelling/SolrSpellChecker.java#L113-L114]
> {noformat}
> sug.score = sd.getDistance(original, sug.string);        
> if (sug.score < min) continue; {noformat}
> If we use this threshold in the distance calculation, we can speed it up: we can stop the calculation as soon as we know that the distance will be lower than the threshold.
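
The early-exit idea in the description above can be sketched roughly as follows. This is only an illustration, not the attached patch: the class name BoundedLevenshtein, the method boundedDistance, and the maxEdits parameter are hypothetical, and the sketch assumes the similarity threshold (min) has already been converted into a maximum number of allowed edits. It relies on the fact that the smallest value in a row of the dynamic-programming table never decreases from one row to the next, so once every cell in a row exceeds the bound, the final distance must exceed it too.

```java
class BoundedLevenshtein {

    /**
     * Hypothetical helper: returns the edit distance between a and b,
     * or -1 as soon as it is known to exceed maxEdits.
     */
    static int boundedDistance(String a, String b, int maxEdits) {
        // The length difference is a lower bound on the edit distance.
        if (Math.abs(a.length() - b.length()) > maxEdits) {
            return -1;
        }

        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minima never decrease, so the result cannot drop
            // back under the bound once the whole row is above it.
            if (rowMin > maxEdits) {
                return -1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        int distance = prev[b.length()];
        return distance <= maxEdits ? distance : -1;
    }
}
```

A caller in the style of the snippet quoted above would then skip a suggestion whenever the helper returns -1, instead of computing the full distance and comparing it afterwards.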


