Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2018/09/07 19:50:00 UTC
[jira] [Commented] (LUCENE-8492) UnifiedHighlighter does not work
with Surround query parser (SurroundQParser)
[ https://issues.apache.org/jira/browse/LUCENE-8492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607596#comment-16607596 ]
David Smiley commented on LUCENE-8492:
--------------------------------------
Moved to Lucene. I tried the new WEIGHT_MATCHES flag and it does not solve this; ah well (was a long shot).
The UnifiedHighlighter ultimately sees no terms or automata (wildcard queries), concludes there is nothing to highlight, and yields a NoOpOffsetStrategy. It'd be nice if it could be told, or could detect, that term extraction is futile for such a query and that the query might contain anything (terms, wildcards, who knows). Then, assuming it also uses WEIGHT_MATCHES, it'd work. Not sure if some UH HighlightFlag is appropriate for that.
I looked at this again and thought of another solution: rewrite the query up-front, then highlight it. I don't think this highlighter should do this, although user-code could. On the Solr side, this could be done by adding getHighlightQuery() to SurroundQParserPlugin, overriding the default behavior to rewrite the parsed query. FWIW I have a trivial Solr test if someone wants to tackle that:
{code:java}
// in TestUnifiedSolrHighlighter, at the bottom
public void testSurroundQParser() {
  // expect exactly one highlighted snippet in field "text" for doc 102
  assertQ(req("q", "{!surround df=text}2w(second, document)", "hl", "true", "hl.fl", "text"),
      "count(//lst[@name='highlighting']/lst[@name='102']/arr[@name='text']/*)=1");
}
{code}
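To make the rewrite-up-front idea concrete, here is a minimal, self-contained Java sketch of the mechanism as I read it from the issue description quoted below. Everything in it (ToyQuery, TermsQuery, LazyQuery, extractTerms) is a hypothetical stand-in and not Lucene or Solr API. It assumes the parsed query only resolves to concrete terms when rewritten against an index, so extracting terms against an empty index finds nothing, while rewriting against the live index first leaves terms for the highlighter to see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class RewriteFirstSketch {

  /** Hypothetical stand-in for a query; NOT Lucene's Query class. */
  interface ToyQuery {
    /** Expands the query against the terms of an index. */
    ToyQuery rewrite(Set<String> indexTerms);

    /** Terms a highlighter could see without rewriting any further. */
    List<String> terms();
  }

  /** A primitive query that already carries its terms. */
  static final class TermsQuery implements ToyQuery {
    private final List<String> terms;

    TermsQuery(List<String> terms) { this.terms = terms; }

    @Override public ToyQuery rewrite(Set<String> indexTerms) { return this; }

    @Override public List<String> terms() { return terms; }
  }

  /** Exposes no terms until rewritten; rewriting keeps only terms found in the index. */
  static final class LazyQuery implements ToyQuery {
    private final List<String> wanted;

    LazyQuery(List<String> wanted) { this.wanted = wanted; }

    @Override public ToyQuery rewrite(Set<String> indexTerms) {
      List<String> found = new ArrayList<>();
      for (String term : wanted) {
        if (indexTerms.contains(term)) {
          found.add(term);
        }
      }
      return new TermsQuery(found);
    }

    @Override public List<String> terms() { return Collections.emptyList(); }
  }

  /** Mimics term extraction that rewrites against an EMPTY index (assumption). */
  static List<String> extractTerms(ToyQuery query) {
    return query.rewrite(Collections.<String>emptySet()).terms();
  }

  public static void main(String[] args) {
    Set<String> liveIndex = new TreeSet<>(Arrays.asList("a", "document", "second"));
    ToyQuery parsed = new LazyQuery(Arrays.asList("second", "document"));

    // Extraction against the empty index: nothing to highlight.
    System.out.println(extractTerms(parsed)); // prints []

    // Rewrite against the live index first, then extract.
    System.out.println(extractTerms(parsed.rewrite(liveIndex))); // prints [second, document]
  }
}
```

With real Lucene classes, the analogous user-code step would presumably be to call IndexSearcher.rewrite(Query) on a live searcher and hand the rewritten query to the highlighter. That part is not exercised by this sketch.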
> UnifiedHighlighter does not work with Surround query parser (SurroundQParser)
> -----------------------------------------------------------------------------
>
> Key: LUCENE-8492
> URL: https://issues.apache.org/jira/browse/LUCENE-8492
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/highlighter
> Affects Versions: 7.2.1
> Reporter: Andy Liu
> Priority: Major
> Attachments: TestUnifiedHighlighterSurround.java
>
>
> I'm attempting to use the UnifiedHighlighter in conjunction with queries parsed by Solr's SurroundQParserPlugin. When doing so, the response yields empty arrays for documents that should contain highlighted snippets.
> I've attached a test for UnifiedHighlighter that uses the surround QueryParser and preprocesses the query in a similar fashion to SurroundQParser, which results in a test failure. When creating a SpanQuery directly (rather than via surround's QueryParser), the test passes.
> The problem can be isolated to the code path initiated by UnifiedHighlighter.extractTerms(), which uses EMPTY_INDEXSEARCHER to extract terms from the query. After a series of method calls, we end up at DistanceQuery.getSpanNearQuery(), where {{((DistanceSubQuery)sqi.next()).addSpanQueries(sncf)}} fails silently and doesn't add any span queries.
> Another data point: If I hack UnifiedHighlighter and pass in a live IndexSearcher to extractTerms(), highlighting works.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)