Posted to commits@lucene.apache.org by kw...@apache.org on 2018/03/22 15:12:09 UTC

lucene-solr:branch_7x: LUCENE-8219: Do a better job of estimating automaton array sizes up front, to save on reallocation. Committed on behalf of Christian Ziech.

Repository: lucene-solr
Updated Branches:
  refs/heads/branch_7x 137dc1df2 -> 26d9a5ecd


LUCENE-8219: Do a better job of estimating automaton array sizes up front, to save on reallocation.  Committed on behalf of Christian Ziech.


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/26d9a5ec
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/26d9a5ec
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/26d9a5ec

Branch: refs/heads/branch_7x
Commit: 26d9a5ecd81c095d86334fffee93c02fdfca514f
Parents: 137dc1d
Author: Karl Wright <Da...@gmail.com>
Authored: Thu Mar 22 11:10:29 2018 -0400
Committer: Karl Wright <Da...@gmail.com>
Committed: Thu Mar 22 11:11:40 2018 -0400

----------------------------------------------------------------------
 lucene/CHANGES.txt                                             | 4 ++++
 .../org/apache/lucene/util/automaton/LevenshteinAutomata.java  | 6 ++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/26d9a5ec/lucene/CHANGES.txt
----------------------------------------------------------------------
diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index d9ad6c9..740a862 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -21,6 +21,10 @@ New Features
 
 Other
 
+* LUCENE-8219: Use a realistic estimate of the number of nodes and links in
+   LevensteinAutomaton.java, to save reallocation of arrays.
+   (Christian Ziech)
+
 * LUCENE-8214: Improve selection of testPoint for GeoComplexPolygon.
   (Ignacio Vera)
   

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/26d9a5ec/lucene/core/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java
----------------------------------------------------------------------
diff --git a/lucene/core/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java b/lucene/core/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java
index 4a07f4b..bee3c00 100644
--- a/lucene/core/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java
+++ b/lucene/core/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java
@@ -152,9 +152,11 @@ public class LevenshteinAutomata {
     final int range = 2*n+1;
     ParametricDescription description = descriptions[n];
     // the number of states is based on the length of the word and n
-    int numStates = description.size();
+    final int numStates = description.size();
+    final int numTransitions = numStates * Math.min(1 + 2 * n, alphabet.length);
+    final int prefixStates = prefix != null ? prefix.codePointCount(0, prefix.length()) : 0;
 
-    Automaton a = new Automaton();
+    final Automaton a = new Automaton(numStates + prefixStates, numTransitions);
     int lastState;
     if (prefix != null) {
       // Insert prefix
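
Note on the sizing arithmetic in the hunk above: the following is a minimal, standalone sketch of the estimate, not the Lucene implementation. The class and method names (AutomatonSizeEstimate, estimateTransitions, prefixStates) are hypothetical and exist only for illustration; the formulas are copied from the added lines. It assumes each state has at most min(1 + 2n, alphabet.length) outgoing transitions, where 1 + 2n equals the `range` value in the surrounding code, and that the prefix contributes one state per Unicode code point.

```java
// Illustrative sketch only: hypothetical helper names, no Lucene dependency.
final class AutomatonSizeEstimate {

    // Mirrors: numStates * Math.min(1 + 2 * n, alphabet.length)
    static int estimateTransitions(int numStates, int n, int alphabetLength) {
        return numStates * Math.min(1 + 2 * n, alphabetLength);
    }

    // Mirrors: prefix != null ? prefix.codePointCount(0, prefix.length()) : 0
    // Counts code points, so a surrogate pair counts as one prefix state.
    static int prefixStates(String prefix) {
        return prefix != null ? prefix.codePointCount(0, prefix.length()) : 0;
    }

    public static void main(String[] args) {
        int numStates = 10;      // assumed value standing in for description.size()
        int n = 2;               // edit distance
        int alphabetLength = 26; // assumed value standing in for alphabet.length
        String prefix = "ab";

        int states = numStates + prefixStates(prefix);
        int transitions = estimateTransitions(numStates, n, alphabetLength);

        // In the patch these two values are passed to new Automaton(states, transitions).
        System.out.println("states=" + states + " transitions=" + transitions);
    }
}
```

With the assumed inputs above, the per-state bound is min(5, 26) = 5, so the transition estimate is 50 and the state estimate is 12. Passing both numbers to the Automaton constructor lets the backing arrays be allocated once instead of growing repeatedly while the automaton is built.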