Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/05/01 09:49:05 UTC

[GitHub] [lucene-solr] thetaphi commented on a change in pull request #1467: LUCENE-9350: Don't hold references to large automata on FuzzyQuery

thetaphi commented on a change in pull request #1467:
URL: https://github.com/apache/lucene-solr/pull/1467#discussion_r418482288



##########
File path: lucene/core/src/java/org/apache/lucene/search/FuzzyTermsEnum.java
##########
@@ -88,43 +89,44 @@
    * @throws IOException if there is a low-level IO error
    */
   public FuzzyTermsEnum(Terms terms, Term term, int maxEdits, int prefixLength, boolean transpositions) throws IOException {
-    this(terms, term, stringToUTF32(term.text()), maxEdits, prefixLength, transpositions);
-  }
-
-  private FuzzyTermsEnum(Terms terms, Term term, int[] codePoints, int maxEdits, int prefixLength, boolean transpositions) throws IOException {
-    this(terms, new AttributeSource(), term, codePoints.length, maxEdits,
-        buildAutomata(term.text(), codePoints, prefixLength, transpositions, maxEdits));
+    this(terms, new AttributeSource(), term, () -> new FuzzyAutomatonBuilder(term.text(), maxEdits, prefixLength, transpositions));
   }
 
   /**
    * Constructor for enumeration of all terms from specified <code>reader</code> which share a prefix of
    * length <code>prefixLength</code> with <code>term</code> and which have at most {@code maxEdits} edits.
    * <p>
-   * After calling the constructor the enumeration is already pointing to the first 
-   * valid term if such a term exists. 
-   * 
+   * After calling the constructor the enumeration is already pointing to the first
+   * valid term if such a term exists.
+   *
    * @param terms Delivers terms.
-   * @param atts {@link AttributeSource} created by the rewrite method of {@link MultiTermQuery}
-   *              that contains information about competitive boosts during rewrite
+   * @param atts An AttributeSource used to share automata between segments
    * @param term Pattern term.
    * @param maxEdits Maximum edit distance.
-   * @param automata An array of levenshtein automata to match against terms,
-   *                 see {@link #buildAutomata(String, int[], int, boolean, int)}
+   * @param prefixLength the length of the required common prefix
+   * @param transpositions whether transpositions should count as a single edit
    * @throws IOException if there is a low-level IO error
    */
-  public FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, int termLength,
-      final int maxEdits, CompiledAutomaton[] automata) throws IOException {
+  FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, int maxEdits, int prefixLength, boolean transpositions) throws IOException {
+    this(terms, atts, term, () -> new FuzzyAutomatonBuilder(term.text(), maxEdits, prefixLength, transpositions));
+  }
+
+  private FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, Supplier<FuzzyAutomatonBuilder> automatonBuilder) throws IOException {
 
-    this.maxEdits = maxEdits;
     this.terms = terms;
-    this.term = term;
     this.atts = atts;
-    this.termLength = termLength;
+    this.term = term;
 
     this.maxBoostAtt = atts.addAttribute(MaxNonCompetitiveBoostAttribute.class);
     this.boostAtt = atts.addAttribute(BoostAttribute.class);
 
-    this.automata = automata;
+    atts.addAttributeImpl(new AutomatonAttributeImpl());

Review comment:
       Uh. You sure you tagged the right sentient here? xD




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org