Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/04 15:45:59 UTC

[GitHub] [lucene] dweiss commented on a change in pull request #427: LUCENE-10220: Add an utility method to get IntervalSource from analyzed text (or token stream)

dweiss commented on a change in pull request #427:
URL: https://github.com/apache/lucene/pull/427#discussion_r742966804



##########
File path: lucene/queries/src/java/org/apache/lucene/queries/intervals/Intervals.java
##########
@@ -429,4 +433,50 @@ public static IntervalsSource after(IntervalsSource source, IntervalsSource refe
         source,
         Intervals.extend(new OffsetIntervalsSource(reference, false), 0, Integer.MAX_VALUE));
   }
+
+  /**
+   * Returns intervals that correspond to tokens from a {@link TokenStream} returned for {@code
+   * text} by applying the provided {@link Analyzer} as if {@code text} was the content of the given
+   * {@code field}. The intervals can be ordered or unordered and can have optional gaps inside.
+   *
+   * @param text The text to analyze.
+   * @param analyzer The {@link Analyzer} to use to acquire a {@link TokenStream} which is then
+   *     converted into intervals.
+   * @param field The field {@code text} should be parsed as.
+   * @param maxGaps Maximum number of allowed gaps between sub-intervals resulting from tokens.
+   * @param ordered Whether sub-intervals should enforce token ordering or not.
+   * @return Returns an {@link IntervalsSource} that matches tokens acquired from analysis of {@code
+   *     text}. Possibly an empty interval source, never {@code null}.
+   * @throws IOException If an I/O exception occurs.
+   */
+  public static IntervalsSource analyzedText(
+      String text, Analyzer analyzer, String field, int maxGaps, boolean ordered)
+      throws IOException {
+    try (TokenStream ts = analyzer.tokenStream(field, text)) {
+      return analyzedText(ts, maxGaps, ordered);
+    }
+  }
+
+  /**
+   * Returns intervals that correspond to tokens from the provided {@link CachingTokenFilter}. This

Review comment:
       Thanks. Will fix. Forgot about the tests.
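
For readers following the thread, the `maxGaps` and `ordered` parameters described in the Javadoc above can be illustrated with a small stand-alone sketch. This is not Lucene code and not the implementation in this pull request: the class `GapMatchSketch` and its methods are hypothetical, whitespace splitting stands in for a real `Analyzer`, and the gap count is assumed to be the width of the matching window minus the number of query tokens. Position increments and token graphs are ignored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-alone model of "ordered/unordered tokens with at most
// maxGaps gaps". It does not use or reproduce Lucene's Intervals API.
class GapMatchSketch {

  // Assumption: lowercasing plus whitespace splitting stands in for an Analyzer.
  static List<String> tokens(String text) {
    return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
  }

  static boolean matches(String query, String doc, int maxGaps, boolean ordered) {
    List<String> q = tokens(query);
    List<String> d = tokens(doc);
    return ordered ? matchesOrdered(q, d, maxGaps) : matchesUnordered(q, d, maxGaps);
  }

  // Ordered: query tokens must appear in query order. For each occurrence of
  // the first token, greedily take the earliest following occurrences, which
  // yields the narrowest window for that start position.
  static boolean matchesOrdered(List<String> q, List<String> d, int maxGaps) {
    for (int start = 0; start < d.size(); start++) {
      if (!d.get(start).equals(q.get(0))) {
        continue;
      }
      int pos = start; // position of the most recently matched query token
      int matched = 1;
      while (matched < q.size()) {
        int next = pos + 1;
        while (next < d.size() && !d.get(next).equals(q.get(matched))) {
          next++;
        }
        if (next == d.size()) {
          break; // remaining query tokens do not occur after pos
        }
        pos = next;
        matched++;
      }
      if (matched == q.size()) {
        int gaps = (pos - start + 1) - q.size();
        if (gaps <= maxGaps) {
          return true;
        }
      }
    }
    return false;
  }

  // Unordered: query tokens may appear in any order. For each start position,
  // extend the window until it contains every query token, then count gaps.
  static boolean matchesUnordered(List<String> q, List<String> d, int maxGaps) {
    Map<String, Integer> needed = new HashMap<>();
    for (String t : q) {
      needed.merge(t, 1, Integer::sum);
    }
    for (int start = 0; start < d.size(); start++) {
      Map<String, Integer> remaining = new HashMap<>(needed);
      int missing = q.size();
      for (int end = start; end < d.size(); end++) {
        Integer left = remaining.get(d.get(end));
        if (left != null && left > 0) {
          remaining.put(d.get(end), left - 1);
          missing--;
        }
        if (missing == 0) {
          int gaps = (end - start + 1) - q.size();
          if (gaps <= maxGaps) {
            return true;
          }
          break; // narrowest window for this start is too wide; try the next start
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    String doc = "the quick brown fox";
    System.out.println(matches("quick fox", doc, 1, true));
    System.out.println(matches("fox quick", doc, 1, false));
  }
}
```

In this model, "quick fox" against "the quick brown fox" needs one gap (for "brown"), so it matches with `maxGaps = 1` but not with `maxGaps = 0`; "fox quick" matches only when `ordered` is false.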




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
For additional commands, e-mail: issues-help@lucene.apache.org