Posted to issues@commons.apache.org by "Jakob Vesterstrøm (Jira)" <ji...@apache.org> on 2020/10/02 08:57:00 UTC

[jira] [Created] (TEXT-188) Speed up LevenshteinDistance with threshold

Jakob Vesterstrøm created TEXT-188:
--------------------------------------

             Summary: Speed up LevenshteinDistance with threshold
                 Key: TEXT-188
                 URL: https://issues.apache.org/jira/browse/TEXT-188
             Project: Commons Text
          Issue Type: Improvement
    Affects Versions: 1.9.1
            Reporter: Jakob Vesterstrøm
         Attachments: improvement.patch

The calculation performed by the LevenshteinDistance class can often be made faster when the class is initialized with a threshold and the distance turns out to be larger than that threshold. In those cases it is often unnecessary to iterate through the whole string, since a lower bound for the result can be established after each iteration. If that lower bound is larger than the threshold, the method can exit early with the same result it would return without this improvement.
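
To illustrate the idea, here is a minimal, self-contained sketch. It is not the attached patch and not the Commons Text implementation; the class name ThresholdLevenshteinSketch and the method limitedDistance are hypothetical, and the choice of lower bound (the smallest value in the current row of the dynamic-programming table) is an assumption about one way such a bound could be obtained. Like LevenshteinDistance with a threshold, the sketch returns -1 when the distance exceeds the threshold.

```java
// Hypothetical illustration only; not the attached patch or the Commons Text code.
final class ThresholdLevenshteinSketch {

    // Returns the edit distance between left and right,
    // or -1 if that distance is larger than threshold.
    static int limitedDistance(CharSequence left, CharSequence right, int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("Threshold must not be negative");
        }
        int n = left.length();
        int m = right.length();

        // The length difference is already a lower bound on the distance.
        if (Math.abs(n - m) > threshold) {
            return -1;
        }

        int[] prev = new int[m + 1]; // row i - 1 of the DP table
        int[] curr = new int[m + 1]; // row i of the DP table
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= n; i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }

            // Every alignment of the full strings passes through some cell of
            // this row, so the final distance cannot be smaller than rowMin.
            if (rowMin > threshold) {
                return -1; // early exit: remaining rows cannot change the answer
            }

            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        return prev[m] <= threshold ? prev[m] : -1;
    }

    public static void main(String[] args) {
        System.out.println(limitedDistance("kitten", "sitting", 3)); // prints 3
        System.out.println(limitedDistance("kitten", "sitting", 2)); // prints -1
    }
}
```

In this sketch, comparing "aaaaaaaaaa" with "bbbbbbbbbb" under a threshold of 2 stops after the third row, because every value in that row is already at least 3. The result is the same -1 that a full pass over the table would produce.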

A patch with the proposed change is attached to this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)