Posted to dev@lucene.apache.org by "Jim Ferenczi (JIRA)" <ji...@apache.org> on 2018/05/02 13:47:00 UTC

[jira] [Commented] (LUCENE-8288) ContextQuery "." for RegexCompletionQuery produces an assertion failure

    [ https://issues.apache.org/jira/browse/LUCENE-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461058#comment-16461058 ] 

Jim Ferenczi commented on LUCENE-8288:
--------------------------------------

Since suggestions can be indexed with or without separators (preservePositionIncrements), the context query adds an optional separator after the context automaton. Because this separator is optional, the regex "." can match the context plus the separator label but nothing from the actual suggestion. Completion queries should always match a prefix of the suggestion (hence the assertion), but the context query doesn't handle a regex that starts with ".". I've attached a patch that fixes the issue by adding a parameter to the ContextQuery constructor to indicate whether suggestions are indexed with position increments. This is a breaking change, since the value must match the one used for indexing, but I don't see how to do it differently if we want to accurately match regexes that start with any character (e.g.: ".[s|t]").
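
To illustrate the ambiguity, here is a rough standalone sketch that models the indexed key as context + context separator + separator label + suggestion and uses java.util.regex in place of Lucene's automata. It is not the code from the patch: the class, the SepMode enum and the helper names are hypothetical, and the two separator values are assumed to mirror Lucene's constants.

```java
import java.util.regex.Pattern;

// Hypothetical model only: plain strings and java.util.regex stand in for
// Lucene's FST and automaton machinery.
final class ContextDotSketch {
    // Assumed to mirror the context separator and SEP_LABEL used by Lucene.
    static final char CONTEXT_SEPARATOR = '\u001D';
    static final char SEP_LABEL = '\u001F';

    // OPTIONAL models the behavior described above; REQUIRED and ABSENT model
    // a query that is told how the suggestions were indexed.
    enum SepMode { OPTIONAL, REQUIRED, ABSENT }

    // Builds: context, context separator, separator label (per mode), inner regex.
    static Pattern contextPattern(String context, String innerRegex, SepMode mode) {
        String sep = Pattern.quote(String.valueOf(SEP_LABEL));
        String sepPart;
        switch (mode) {
            case OPTIONAL: sepPart = "(?:" + sep + ")?"; break;
            case REQUIRED: sepPart = sep; break;
            default: sepPart = ""; break;
        }
        return Pattern.compile(
            Pattern.quote(context + CONTEXT_SEPARATOR) + sepPart + "(?:" + innerRegex + ")");
    }

    // Returns the shortest prefix of key that the pattern accepts in full,
    // or null if no prefix is accepted.
    static String shortestAcceptedPrefix(Pattern pattern, String key) {
        for (int end = 0; end <= key.length(); end++) {
            String prefix = key.substring(0, end);
            if (pattern.matcher(prefix).matches()) {
                return prefix;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "type1" + CONTEXT_SEPARATOR + SEP_LABEL;
        String key = header + "sugg";

        // With an optional separator, "." can consume SEP_LABEL itself, so the
        // accepted prefix stops before any suggestion character.
        String loose = shortestAcceptedPrefix(contextPattern("type1", ".", SepMode.OPTIONAL), key);
        System.out.println("optional: suggestion chars matched = " + (loose.length() - header.length()));

        // With a required separator, "." has to consume a suggestion character.
        String strict = shortestAcceptedPrefix(contextPattern("type1", ".", SepMode.REQUIRED), key);
        System.out.println("required: suggestion chars matched = " + (strict.length() - header.length()));
    }
}
```

In this model the optional separator also lets ".[s|t]" accept "sugg", because "." swallows the separator label and "[s|t]" then matches the leading "s". With the separator required, the same regex correctly rejects "sugg", which is why the query needs to know how the suggestions were indexed.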

> ContextQuery "." for RegexCompletionQuery produces an assertion failure
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-8288
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8288
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Julie Tibshirani
>            Priority: Major
>         Attachments: LUCENE-8288-repro.patch, LUCENE-8288.patch
>
>
> When a RegexCompletionQuery of "." is provided to ContextQuery, the following assertion failure occurs:
> {code:java}
> java.lang.AssertionError: input should not end with a context separator followed by SEP_LABEL
> at org.apache.lucene.search.suggest.document.ContextQuery$ContextCompletionWeight.setInnerWeight(ContextQuery.java:299)
> at org.apache.lucene.search.suggest.document.ContextQuery$ContextCompletionWeight.setNextMatch(ContextQuery.java:275)
> at org.apache.lucene.search.suggest.document.NRTSuggester.lookup(NRTSuggester.java:221)
> at org.apache.lucene.search.suggest.document.CompletionScorer.score(CompletionScorer.java:70)
> at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
> at org.apache.lucene.search.suggest.document.SuggestIndexSearcher.suggest(SuggestIndexSearcher.java:78)
> at org.apache.lucene.search.suggest.document.SuggestIndexSearcher.suggest(SuggestIndexSearcher.java:58)
> at org.apache.lucene.search.suggest.document.TestContextQuery.testDotRegexQuery(TestContextQuery.java:188)
> {code}
> Note that this is a related, but distinct issue from https://issues.apache.org/jira/browse/LUCENE-8287, where the RegexCompletionQuery is empty.
> The attached patch provides a reproduction of the issue, as the test case TestContextQuery#testRegexDotQuery. To reproduce, Java assertions must be enabled (as in the default configuration for tests).
> The patch also provides a test case for the normal behavior of an empty RegexCompletionQuery, when it is not wrapped in ContextQuery (TestRegexCompletionQuery#testRegexDotQuery). In this case, there is no error, and all suggestions are returned.
> From a quick look, it seems as though "." doesn't capture any characters past CompletionAnalyzer.SEP_LABEL, so the matching prefix in ContextCompletionWeight#setInnerWeight is unexpectedly empty.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
