Posted to issues@lucene.apache.org by "Julie Tibshirani (Jira)" <ji...@apache.org> on 2022/03/28 16:49:00 UTC

[jira] [Commented] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites

    [ https://issues.apache.org/jira/browse/LUCENE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513493#comment-17513493 ] 

Julie Tibshirani commented on LUCENE-10454:
-------------------------------------------

Thanks for looking into this! I don't understand UH deeply enough to weigh in on the patch, but generally I'd be supportive of a best-effort fix. We are also looking into switching to weightMatches=true in Elasticsearch.

> UnifiedHighlighter can miss terms because of query rewrites
> -----------------------------------------------------------
>
>                 Key: LUCENE-10454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10454
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Julie Tibshirani
>            Priority: Minor
>         Attachments: LUCENE-10454-fix.patch, LUCENE-10454.patch
>
>
> Before extracting terms from a query, UnifiedHighlighter rewrites the query using an empty searcher. If the query rewrites to MatchNoDocsQuery when the reader is empty, then the highlighter will fail to extract terms. This is more of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when any of its required clauses is MatchNoDocsQuery (https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch showing the problem.
> This feels like a pretty esoteric issue, but I figured it was worth raising for awareness. I think it only applies when weightMatches=false, which isn't the default. I couldn't find any existing queries in Lucene that would be affected.
> We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, since a couple of custom queries rewrite to MatchNoDocsQuery when the reader is empty.
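
To make the failure mode in the description concrete, here is a toy model in Python. It does not use Lucene's API: every class and function name below is hypothetical and only mimics the behaviour described above (a rewrite against an empty reader, a conjunction that collapses when any required clause matches nothing, and term extraction performed after the rewrite).

```python
from dataclasses import dataclass
from typing import Set, Tuple


class Query:
    """Toy stand-in for a query; NOT Lucene's Query class."""

    def rewrite(self, reader_is_empty: bool) -> "Query":
        return self

    def terms(self) -> Set[str]:
        return set()


@dataclass(frozen=True)
class MatchNoDocs(Query):
    """Matches nothing and therefore exposes no terms."""


@dataclass(frozen=True)
class TermQ(Query):
    term: str

    def terms(self) -> Set[str]:
        return {self.term}


@dataclass(frozen=True)
class EmptyAwareQuery(Query):
    """Hypothetical custom query that short-circuits on an empty reader."""

    term: str

    def rewrite(self, reader_is_empty: bool) -> Query:
        return MatchNoDocs() if reader_is_empty else TermQ(self.term)


@dataclass(frozen=True)
class BooleanAnd(Query):
    """Conjunction in which every clause is required."""

    clauses: Tuple[Query, ...]

    def rewrite(self, reader_is_empty: bool) -> Query:
        rewritten = tuple(c.rewrite(reader_is_empty) for c in self.clauses)
        # Mirrors the behaviour described for LUCENE-10412: one required
        # clause that matches nothing collapses the whole conjunction.
        if any(isinstance(c, MatchNoDocs) for c in rewritten):
            return MatchNoDocs()
        return BooleanAnd(rewritten)

    def terms(self) -> Set[str]:
        return set().union(*(c.terms() for c in self.clauses))


def extract_terms_after_rewrite(query: Query, reader_is_empty: bool) -> Set[str]:
    """Rewrite first, then collect terms, as the description says the highlighter does."""
    return query.rewrite(reader_is_empty).terms()


query = BooleanAnd((TermQ("lucene"), EmptyAwareQuery("highlight")))

# Rewriting against an empty reader loses every term, including "lucene".
terms_empty_reader = extract_terms_after_rewrite(query, reader_is_empty=True)

# Rewriting against a non-empty reader keeps both terms.
terms_real_reader = extract_terms_after_rewrite(query, reader_is_empty=False)
```

In this sketch the term "lucene" is lost even though its own clause rewrites normally, because the collapse happens at the conjunction level. That is the sense in which the highlighter "can miss terms"; how the attached patches address it is not modelled here.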



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
