Posted to dev@lucene.apache.org by "Jack Krupansky (JIRA)" <ji...@apache.org> on 2013/08/16 02:41:54 UTC

[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight

    [ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741735#comment-13741735 ] 

Jack Krupansky commented on LUCENE-4734:
----------------------------------------

Looking at some highlighter code, I see this constructor in org.apache.lucene.search.vectorhighlight.FieldPhraseList.java of branch_4x:

{code}
/**
 * a constructor.
 * 
 * @param fieldTermStack FieldTermStack object
 * @param fieldQuery FieldQuery object
 * @param phraseLimit maximum size of phraseList
 */
public FieldPhraseList( FieldTermStack fieldTermStack, FieldQuery fieldQuery, int phraseLimit ){
  final String field = fieldTermStack.getFieldName();

  QueryPhraseMap qpm = fieldQuery.getRootMap(field);
  if (qpm != null) {
    LinkedList<TermInfo> phraseCandidate = new LinkedList<TermInfo>();
    extractPhrases(fieldTermStack.termList, qpm, phraseCandidate, 0);
    assert phraseCandidate.size() == 0;
  }
}
{code}

Clearly phraseLimit is no longer used. Is it being deprecated, or is this simply work in progress that will use it again eventually?

This parameter is passed down through several layers of code; ultimately it is set in Solr via the hl.phraseLimit parameter.

Seems like a "dead parameter" that should be cleaned up now or deprecated for future cleanup, but I can't say that I have been able to follow all of the work that has transpired in the highlighters.

The change occurred in revision 1505732 (related to this Jira). Before then, this parameter was used.
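
If deprecation turns out to be the preferred route, one possible shape is sketched below. This is only an illustration under stated assumptions: PhraseListSketch is a simplified stand-in rather than the real FieldPhraseList, and the constructor without phraseLimit is hypothetical, not something that exists in branch_4x.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Simplified stand-in for illustration only; NOT the real Lucene FieldPhraseList.
class PhraseListSketch {
  final List<String> phrases = new LinkedList<String>();

  /** Hypothetical replacement constructor that drops the unused parameter. */
  PhraseListSketch(List<String> terms) {
    phrases.addAll(terms);
  }

  /**
   * Old signature kept so existing callers still compile.
   *
   * @param phraseLimit ignored
   * @deprecated phraseLimit has no effect; use {@link #PhraseListSketch(List)}.
   */
  @Deprecated
  PhraseListSketch(List<String> terms, int phraseLimit) {
    this(terms); // delegate; the limit is deliberately not consulted
  }

  public static void main(String[] args) {
    List<String> terms = Arrays.asList("B", "E");
    // Both constructors produce the same result, whatever limit is passed.
    System.out.println(new PhraseListSketch(terms).phrases);
    System.out.println(new PhraseListSketch(terms, 1).phrases);
  }
}
```

Keeping the old signature as a thin delegate would let the layers above (including whatever reads hl.phraseLimit) be cleaned up gradually instead of in one change.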

Comments? Or should this be a separate Jira issue?

                
> FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
> --------------------------------------------------------------------
>
>                 Key: LUCENE-4734
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4734
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Ryan Lauck
>            Assignee: Adrien Grand
>              Labels: fastvectorhighlighter, highlighter
>             Fix For: 5.0, 4.5
>
>         Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch
>
>
> If a proximity phrase query overlaps with any other query term it will not be highlighted.
> Example Text:  A B C D E F G
> Example Queries: 
> "B E"~10 D
> (D will be highlighted instead of "B C D E")
> "B E"~10 "C F"~10
> (nothing will be highlighted)
> This can be traced to the FieldPhraseList constructor's inner while loop. From the first example query, the first TermInfo popped off the stack will be "B". The second TermInfo will be "D" which will not be found in the submap for "B E"~10 and will trigger a failed match.
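
The failure described in the issue can be modelled with a toy sketch. It is not the actual Lucene code or the committed fix: OverlapSketch and both method names are hypothetical, the term stack is reduced to a list of strings, and slop and positions are ignored entirely. It only contrasts a matcher that requires phrase terms to be adjacent on the stack with one that lets other query terms sit in between.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of the reported behaviour; hypothetical names, no Lucene types.
class OverlapSketch {

  // Strict: the phrase terms must appear as adjacent entries on the term stack.
  static boolean matchesStrict(List<String> stack, List<String> phrase) {
    return Collections.indexOfSubList(stack, phrase) >= 0;
  }

  // Tolerant: terms belonging to other queries may sit between phrase terms.
  static boolean matchesSkipping(List<String> stack, List<String> phrase) {
    int next = 0; // index of the phrase term we are still waiting for
    for (String term : stack) {
      if (next < phrase.size() && term.equals(phrase.get(next))) {
        next++;
      }
    }
    return next == phrase.size();
  }

  public static void main(String[] args) {
    // Query "B E"~10 D against text A B C D E F G: only B, D, E reach the stack.
    List<String> stack = Arrays.asList("B", "D", "E");
    List<String> phrase = Arrays.asList("B", "E");
    System.out.println(matchesStrict(stack, phrase));   // false: D follows B
    System.out.println(matchesSkipping(stack, phrase)); // true
  }
}
```

In the second example query the stack would be B, C, E, F, so under the strict rule neither "B E" nor "C F" is adjacent, which is consistent with nothing being highlighted.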
