You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/10/03 04:03:38 UTC

[GitHub] [commons-text] kinow commented on a change in pull request #174: TEXT-188 Speed up LevenshteinDistance with threshold by exiting early

kinow commented on a change in pull request #174:
URL: https://github.com/apache/commons-text/pull/174#discussion_r499113488



##########
File path: src/main/java/org/apache/commons/text/similarity/LevenshteinDistance.java
##########
@@ -285,6 +286,11 @@ private static int limitedCompare(CharSequence left, CharSequence right, final i
                     // left and up
                     d[i] = 1 + Math.min(Math.min(d[i - 1], p[i]), p[i - 1]);
                 }
+                lowerBound = Math.min(lowerBound, d[i]);
+            }
+            // if the lower bound is greater than the threshold, then exit early
+            if (lowerBound > threshold) {
+                return -1;

Review comment:
       @vesterstroem it looks to me like this nullifies the `if (p[n] <= threshold) {` further down.
   
   I was looking at the [coverage report](https://coveralls.io/builds/33884935/source?filename=src/main/java/org/apache/commons/text/similarity/LevenshteinDistance.java), and it seems to be that the coverage decreased because that `return -1` never happens.
   
   If so, I think we can remove it and simply `return p[n];`. WDYT?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org