Posted to dev@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2016/12/07 00:23:23 UTC

Re: lucene-solr:branch_6x: LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA requireFieldMatch=false)

David: something went haywire with your backport -- it added a 7.0.0
section to CHANGES.txt on branch_6x, which is breaking the smoke tester
Jenkins jobs.
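
For anyone skimming the quoted diff, here is a rough sketch of what the field matcher predicate changes when query terms are filtered per field. It assumes nothing beyond the JDK: FieldMatcherSketch, QueryTerm, defaultMatcher and filterTerms are hypothetical names made up for illustration and are not Lucene classes or methods. It only mirrors the shape of the new filterExtractedTerms(Predicate<String>, Set<Term>) in the commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

// Illustration only: a JDK-only sketch, not the Lucene implementation.
final class FieldMatcherSketch {

  // Hypothetical stand-in for a (field, text) query term; NOT Lucene's Term class.
  static final class QueryTerm {
    final String field;
    final String text;

    QueryTerm(String field, String text) {
      this.field = field;
      this.text = text;
    }
  }

  // Default described in the commit: only terms on the highlighted field match
  // (the requireFieldMatch=true behaviour).
  static Predicate<String> defaultMatcher(String field) {
    return field::equals;
  }

  // Keep the terms whose field passes the predicate, drop the field name,
  // and return the remaining term texts sorted and de-duplicated.
  static List<String> filterTerms(Predicate<String> fieldMatcher, List<QueryTerm> queryTerms) {
    TreeSet<String> filtered = new TreeSet<>();
    for (QueryTerm term : queryTerms) {
      if (fieldMatcher.test(term.field)) {
        filtered.add(term.text);
      }
    }
    return new ArrayList<>(filtered);
  }

  public static void main(String[] args) {
    List<QueryTerm> terms = Arrays.asList(
        new QueryTerm("title", "apache"),
        new QueryTerm("body", "highlighter"),
        new QueryTerm("body", "lucene"));

    // Default matcher: only the "body" terms are kept when highlighting "body".
    System.out.println(filterTerms(defaultMatcher("body"), terms));

    // Match-all predicate (the requireFieldMatch=false case): every term is kept.
    System.out.println(filterTerms(field -> true, terms));
  }
}
```

With the default matcher only terms on the highlighted field survive; a predicate that accepts every field name lets terms from other fields through, which is the requireFieldMatch=false case the CHANGES entry describes.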

: Date: Mon,  5 Dec 2016 21:21:19 +0000 (UTC)
: From: dsmiley@apache.org
: Reply-To: dev@lucene.apache.org
: To: commits@lucene.apache.org
: Subject: lucene-solr:branch_6x: LUCENE-7575: Add UnifiedHighlighter field
:     matcher predicate (AKA requireFieldMatch=false)
: 
: Repository: lucene-solr
: Updated Branches:
:   refs/heads/branch_6x cdce62108 -> 4e7a7dbf9
: 
: 
: LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA requireFieldMatch=false)
: 
: (cherry picked from commit 2e948fe)
: 
: 
: Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
: Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/4e7a7dbf
: Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/4e7a7dbf
: Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/4e7a7dbf
: 
: Branch: refs/heads/branch_6x
: Commit: 4e7a7dbf9a56468f41e89f5289833081b27f1b14
: Parents: cdce621
: Author: David Smiley <ds...@apache.org>
: Authored: Mon Dec 5 16:11:57 2016 -0500
: Committer: David Smiley <ds...@apache.org>
: Committed: Mon Dec 5 16:21:12 2016 -0500
: 
: ----------------------------------------------------------------------
:  lucene/CHANGES.txt                              |  56 ++++
:  .../uhighlight/MemoryIndexOffsetStrategy.java   |  10 +-
:  .../uhighlight/MultiTermHighlighting.java       |  37 +--
:  .../lucene/search/uhighlight/PhraseHelper.java  | 158 ++++++++---
:  .../search/uhighlight/UnifiedHighlighter.java   |  64 +++--
:  .../uhighlight/TestUnifiedHighlighter.java      | 275 +++++++++++++++++++
:  .../TestUnifiedHighlighterExtensibility.java    |   3 +-
:  7 files changed, 519 insertions(+), 84 deletions(-)
: ----------------------------------------------------------------------
: 
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/CHANGES.txt
: ----------------------------------------------------------------------
: diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
: index b0a5f9c..853f171 100644
: --- a/lucene/CHANGES.txt
: +++ b/lucene/CHANGES.txt
: @@ -3,6 +3,57 @@ Lucene Change Log
:  For more information on past and future Lucene versions, please see:
:  http://s.apache.org/luceneversions
:  
: +======================= Lucene 7.0.0 =======================
: +
: +API Changes
: +
: +* LUCENE-2605: Classic QueryParser no longer splits on whitespace by default.
: +  Use setSplitOnWhitespace(true) to get the old behavior.  (Steve Rowe)
: +
: +* LUCENE-7369: Similarity.coord and BooleanQuery.disableCoord are removed.
: +  (Adrien Grand)
: +
: +* LUCENE-7368: Removed query normalization. (Adrien Grand)
: +
: +* LUCENE-7355: AnalyzingQueryParser has been removed as its functionality has
: +  been folded into the classic QueryParser. (Adrien Grand)
: +
: +* LUCENE-7407: Doc values APIs have been switched from random access
: +  to iterators, enabling future codec compression improvements. (Mike
: +  McCandless)
: +
: +* LUCENE-7475: Norms now support sparsity, allowing to pay for what is
: +  actually used. (Adrien Grand)
: +
: +* LUCENE-7494: Points now have a per-field API, like doc values. (Adrien Grand)
: +
: +Bug Fixes
: +
: +Improvements
: +
: +* LUCENE-7489: Better storage of sparse doc-values fields with the default
: +  codec. (Adrien Grand)
: +
: +Optimizations
: +
: +* LUCENE-7416: BooleanQuery optimizes queries that have queries that occur both
: +  in the sets of SHOULD and FILTER clauses, or both in MUST/FILTER and MUST_NOT
: +  clauses. (Spyros Kapnissis via Adrien Grand, Uwe Schindler)
: +
: +* LUCENE-7506: FastTaxonomyFacetCounts should use CPU in proportion to
: +  the size of the intersected set of hits from the query and documents
: +  that have a facet value, so sparse faceting works as expected
: +  (Adrien Grand via Mike McCandless)
: +
: +* LUCENE-7519: Add optimized APIs to compute browse-only top level
: +  facets (Mike McCandless)
: +
: +Other
: +
: +* LUCENE-7328: Remove LegacyNumericEncoding from GeoPointField. (Nick Knize)
: +
: +* LUCENE-7360: Remove Explanation.toHtml() (Alan Woodward)
: +
:  ======================= Lucene 6.4.0 =======================
:  
:  API Changes
: @@ -73,6 +124,11 @@ Improvements
:  * LUCENE-7537: Index time sorting now supports multi-valued sorts
:    using selectors (MIN, MAX, etc.) (Jim Ferenczi via Mike McCandless)
:  
: +* LUCENE-7575: UnifiedHighlighter can now highlight fields with queries that don't
: +  necessarily refer to that field (AKA requireFieldMatch==false). Disabled by default.
: +  See UH get/setFieldMatcher. (Jim Ferenczi via David Smiley)
: +
: +
:  Optimizations
:  
:  * LUCENE-7568: Optimize merging when index sorting is used but the
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
: index 4028912..0001a80 100644
: --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
: +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
: @@ -23,6 +23,7 @@ import java.util.Collection;
:  import java.util.Collections;
:  import java.util.List;
:  import java.util.function.Function;
: +import java.util.function.Predicate;
:  
:  import org.apache.lucene.analysis.Analyzer;
:  import org.apache.lucene.analysis.FilteringTokenFilter;
: @@ -49,7 +50,7 @@ public class MemoryIndexOffsetStrategy extends AnalysisOffsetStrategy {
:    private final LeafReader leafReader;
:    private final CharacterRunAutomaton preMemIndexFilterAutomaton;
:  
: -  public MemoryIndexOffsetStrategy(String field, BytesRef[] extractedTerms, PhraseHelper phraseHelper,
: +  public MemoryIndexOffsetStrategy(String field, Predicate<String> fieldMatcher, BytesRef[] extractedTerms, PhraseHelper phraseHelper,
:                                     CharacterRunAutomaton[] automata, Analyzer analyzer,
:                                     Function<Query, Collection<Query>> multiTermQueryRewrite) {
:      super(field, extractedTerms, phraseHelper, automata, analyzer);
: @@ -57,13 +58,14 @@ public class MemoryIndexOffsetStrategy extends AnalysisOffsetStrategy {
:      memoryIndex = new MemoryIndex(true, storePayloads);//true==store offsets
:      leafReader = (LeafReader) memoryIndex.createSearcher().getIndexReader(); // appears to be re-usable
:      // preFilter for MemoryIndex
: -    preMemIndexFilterAutomaton = buildCombinedAutomaton(field, terms, this.automata, phraseHelper, multiTermQueryRewrite);
: +    preMemIndexFilterAutomaton = buildCombinedAutomaton(fieldMatcher, terms, this.automata, phraseHelper, multiTermQueryRewrite);
:    }
:  
:    /**
:     * Build one {@link CharacterRunAutomaton} matching any term the query might match.
:     */
: -  private static CharacterRunAutomaton buildCombinedAutomaton(String field, BytesRef[] terms,
: +  private static CharacterRunAutomaton buildCombinedAutomaton(Predicate<String> fieldMatcher,
: +                                                              BytesRef[] terms,
:                                                                CharacterRunAutomaton[] automata,
:                                                                PhraseHelper strictPhrases,
:                                                                Function<Query, Collection<Query>> multiTermQueryRewrite) {
: @@ -74,7 +76,7 @@ public class MemoryIndexOffsetStrategy extends AnalysisOffsetStrategy {
:      Collections.addAll(allAutomata, automata);
:      for (SpanQuery spanQuery : strictPhrases.getSpanQueries()) {
:        Collections.addAll(allAutomata,
: -          MultiTermHighlighting.extractAutomata(spanQuery, field, true, multiTermQueryRewrite));//true==lookInSpan
: +          MultiTermHighlighting.extractAutomata(spanQuery, fieldMatcher, true, multiTermQueryRewrite));//true==lookInSpan
:      }
:  
:      if (allAutomata.size() == 1) {
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
: index fd6a26a..267d603 100644
: --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
: +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
: @@ -22,6 +22,7 @@ import java.util.Collection;
:  import java.util.Comparator;
:  import java.util.List;
:  import java.util.function.Function;
: +import java.util.function.Predicate;
:  
:  import org.apache.lucene.index.Term;
:  import org.apache.lucene.search.AutomatonQuery;
: @@ -56,50 +57,52 @@ class MultiTermHighlighting {
:    }
:  
:    /**
: -   * Extracts all MultiTermQueries for {@code field}, and returns equivalent
: -   * automata that will match terms.
: +   * Extracts MultiTermQueries that match the provided field predicate.
: +   * Returns equivalent automata that will match terms.
:     */
: -  public static CharacterRunAutomaton[] extractAutomata(Query query, String field, boolean lookInSpan,
: +  public static CharacterRunAutomaton[] extractAutomata(Query query,
: +                                                        Predicate<String> fieldMatcher,
: +                                                        boolean lookInSpan,
:                                                          Function<Query, Collection<Query>> preRewriteFunc) {
:      List<CharacterRunAutomaton> list = new ArrayList<>();
:      Collection<Query> customSubQueries = preRewriteFunc.apply(query);
:      if (customSubQueries != null) {
:        for (Query sub : customSubQueries) {
: -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
: +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
:        }
:      } else if (query instanceof BooleanQuery) {
:        for (BooleanClause clause : (BooleanQuery) query) {
:          if (!clause.isProhibited()) {
: -          list.addAll(Arrays.asList(extractAutomata(clause.getQuery(), field, lookInSpan, preRewriteFunc)));
: +          list.addAll(Arrays.asList(extractAutomata(clause.getQuery(), fieldMatcher, lookInSpan, preRewriteFunc)));
:          }
:        }
:      } else if (query instanceof ConstantScoreQuery) {
: -      list.addAll(Arrays.asList(extractAutomata(((ConstantScoreQuery) query).getQuery(), field, lookInSpan,
: +      list.addAll(Arrays.asList(extractAutomata(((ConstantScoreQuery) query).getQuery(), fieldMatcher, lookInSpan,
:            preRewriteFunc)));
:      } else if (query instanceof DisjunctionMaxQuery) {
:        for (Query sub : ((DisjunctionMaxQuery) query).getDisjuncts()) {
: -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
: +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
:        }
:      } else if (lookInSpan && query instanceof SpanOrQuery) {
:        for (Query sub : ((SpanOrQuery) query).getClauses()) {
: -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
: +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
:        }
:      } else if (lookInSpan && query instanceof SpanNearQuery) {
:        for (Query sub : ((SpanNearQuery) query).getClauses()) {
: -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
: +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
:        }
:      } else if (lookInSpan && query instanceof SpanNotQuery) {
: -      list.addAll(Arrays.asList(extractAutomata(((SpanNotQuery) query).getInclude(), field, lookInSpan,
: +      list.addAll(Arrays.asList(extractAutomata(((SpanNotQuery) query).getInclude(), fieldMatcher, lookInSpan,
:            preRewriteFunc)));
:      } else if (lookInSpan && query instanceof SpanPositionCheckQuery) {
: -      list.addAll(Arrays.asList(extractAutomata(((SpanPositionCheckQuery) query).getMatch(), field, lookInSpan,
: +      list.addAll(Arrays.asList(extractAutomata(((SpanPositionCheckQuery) query).getMatch(), fieldMatcher, lookInSpan,
:            preRewriteFunc)));
:      } else if (lookInSpan && query instanceof SpanMultiTermQueryWrapper) {
: -      list.addAll(Arrays.asList(extractAutomata(((SpanMultiTermQueryWrapper<?>) query).getWrappedQuery(), field,
: -          lookInSpan, preRewriteFunc)));
: +      list.addAll(Arrays.asList(extractAutomata(((SpanMultiTermQueryWrapper<?>) query).getWrappedQuery(),
: +          fieldMatcher, lookInSpan, preRewriteFunc)));
:      } else if (query instanceof AutomatonQuery) {
:        final AutomatonQuery aq = (AutomatonQuery) query;
: -      if (aq.getField().equals(field)) {
: +      if (fieldMatcher.test(aq.getField())) {
:          list.add(new CharacterRunAutomaton(aq.getAutomaton()) {
:            @Override
:            public String toString() {
: @@ -110,7 +113,7 @@ class MultiTermHighlighting {
:      } else if (query instanceof PrefixQuery) {
:        final PrefixQuery pq = (PrefixQuery) query;
:        Term prefix = pq.getPrefix();
: -      if (prefix.field().equals(field)) {
: +      if (fieldMatcher.test(prefix.field())) {
:          list.add(new CharacterRunAutomaton(Operations.concatenate(Automata.makeString(prefix.text()),
:              Automata.makeAnyString())) {
:            @Override
: @@ -121,7 +124,7 @@ class MultiTermHighlighting {
:        }
:      } else if (query instanceof FuzzyQuery) {
:        final FuzzyQuery fq = (FuzzyQuery) query;
: -      if (fq.getField().equals(field)) {
: +      if (fieldMatcher.test(fq.getField())) {
:          String utf16 = fq.getTerm().text();
:          int termText[] = new int[utf16.codePointCount(0, utf16.length())];
:          for (int cp, i = 0, j = 0; i < utf16.length(); i += Character.charCount(cp)) {
: @@ -142,7 +145,7 @@ class MultiTermHighlighting {
:        }
:      } else if (query instanceof TermRangeQuery) {
:        final TermRangeQuery tq = (TermRangeQuery) query;
: -      if (tq.getField().equals(field)) {
: +      if (fieldMatcher.test(tq.getField())) {
:          final CharsRef lowerBound;
:          if (tq.getLowerTerm() == null) {
:            lowerBound = null;
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
: index 7693eb2..0c7897f 100644
: --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
: +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
: @@ -16,17 +16,50 @@
:   */
:  package org.apache.lucene.search.uhighlight;
:  
: -import org.apache.lucene.index.*;
: -import org.apache.lucene.search.*;
: +import java.io.IOException;
: +import java.util.ArrayList;
: +import java.util.Arrays;
: +import java.util.Collection;
: +import java.util.Collections;
: +import java.util.Comparator;
: +import java.util.HashMap;
: +import java.util.HashSet;
: +import java.util.Iterator;
: +import java.util.LinkedHashSet;
: +import java.util.List;
: +import java.util.Map;
: +import java.util.PriorityQueue;
: +import java.util.Set;
: +import java.util.TreeSet;
: +import java.util.function.Function;
: +import java.util.function.Predicate;
: +
: +import org.apache.lucene.index.BinaryDocValues;
: +import org.apache.lucene.index.FieldInfos;
: +import org.apache.lucene.index.Fields;
: +import org.apache.lucene.index.FilterLeafReader;
: +import org.apache.lucene.index.LeafReader;
: +import org.apache.lucene.index.LeafReaderContext;
: +import org.apache.lucene.index.NumericDocValues;
: +import org.apache.lucene.index.PostingsEnum;
: +import org.apache.lucene.index.SortedDocValues;
: +import org.apache.lucene.index.Term;
: +import org.apache.lucene.index.Terms;
: +import org.apache.lucene.search.DocIdSetIterator;
: +import org.apache.lucene.search.IndexSearcher;
: +import org.apache.lucene.search.MatchAllDocsQuery;
: +import org.apache.lucene.search.MultiTermQuery;
: +import org.apache.lucene.search.Query;
: +import org.apache.lucene.search.TwoPhaseIterator;
:  import org.apache.lucene.search.highlight.WeightedSpanTerm;
:  import org.apache.lucene.search.highlight.WeightedSpanTermExtractor;
: -import org.apache.lucene.search.spans.*;
: +import org.apache.lucene.search.spans.SpanCollector;
: +import org.apache.lucene.search.spans.SpanMultiTermQueryWrapper;
: +import org.apache.lucene.search.spans.SpanQuery;
: +import org.apache.lucene.search.spans.SpanWeight;
: +import org.apache.lucene.search.spans.Spans;
:  import org.apache.lucene.util.BytesRef;
:  
: -import java.io.IOException;
: -import java.util.*;
: -import java.util.function.Function;
: -
:  /**
:   * Helps the {@link FieldOffsetStrategy} with strict position highlighting (e.g. highlight phrases correctly).
:   * This is a stateful class holding information about the query, but it can (and is) re-used across highlighting
: @@ -40,7 +73,7 @@ import java.util.function.Function;
:  public class PhraseHelper {
:  
:    public static final PhraseHelper NONE = new PhraseHelper(new MatchAllDocsQuery(), "_ignored_",
: -      spanQuery -> null, query -> null, true);
: +      (s) -> false, spanQuery -> null, query -> null, true);
:  
:    //TODO it seems this ought to be a general thing on Spans?
:    private static final Comparator<? super Spans> SPANS_COMPARATOR = (o1, o2) -> {
: @@ -59,10 +92,11 @@ public class PhraseHelper {
:      }
:    };
:  
: -  private final String fieldName; // if non-null, only look at queries/terms for this field
: +  private final String fieldName;
:    private final Set<Term> positionInsensitiveTerms; // (TermQuery terms)
:    private final Set<SpanQuery> spanQueries;
:    private final boolean willRewrite;
: +  private final Predicate<String> fieldMatcher;
:  
:    /**
:     * Constructor.
: @@ -73,14 +107,15 @@ public class PhraseHelper {
:     * to be set before the {@link WeightedSpanTermExtractor}'s extraction is invoked.
:     * {@code ignoreQueriesNeedingRewrite} effectively ignores any query clause that needs to be "rewritten", which is
:     * usually limited to just a {@link SpanMultiTermQueryWrapper} but could be other custom ones.
: +   * {@code fieldMatcher} The field name predicate to use for extracting the query part that must be highlighted.
:     */
: -  public PhraseHelper(Query query, String field, Function<SpanQuery, Boolean> rewriteQueryPred,
: +  public PhraseHelper(Query query, String field, Predicate<String> fieldMatcher, Function<SpanQuery, Boolean> rewriteQueryPred,
:                        Function<Query, Collection<Query>> preExtractRewriteFunction,
:                        boolean ignoreQueriesNeedingRewrite) {
: -    this.fieldName = field; // if null then don't require field match
: +    this.fieldName = field;
: +    this.fieldMatcher = fieldMatcher;
:      // filter terms to those we want
: -    positionInsensitiveTerms = field != null ? new FieldFilteringTermHashSet(field) : new HashSet<>();
: -    // requireFieldMatch optional
: +    positionInsensitiveTerms = new FieldFilteringTermSet();
:      spanQueries = new HashSet<>();
:  
:      // TODO Have toSpanQuery(query) Function as an extension point for those with custom Query impls
: @@ -131,11 +166,11 @@ public class PhraseHelper {
:        @Override
:        protected void extractWeightedSpanTerms(Map<String, WeightedSpanTerm> terms, SpanQuery spanQuery,
:                                                float boost) throws IOException {
: -        if (field != null) {
: -          // if this span query isn't for this field, skip it.
: -          Set<String> fieldNameSet = new HashSet<>();//TODO reuse.  note: almost always size 1
: -          collectSpanQueryFields(spanQuery, fieldNameSet);
: -          if (!fieldNameSet.contains(field)) {
: +        // if this span query isn't for this field, skip it.
: +        Set<String> fieldNameSet = new HashSet<>();//TODO reuse.  note: almost always size 1
: +        collectSpanQueryFields(spanQuery, fieldNameSet);
: +        for (String spanField : fieldNameSet) {
: +          if (!fieldMatcher.test(spanField)) {
:              return;
:            }
:          }
: @@ -190,10 +225,11 @@ public class PhraseHelper {
:      if (spanQueries.isEmpty()) {
:        return Collections.emptyMap();
:      }
: +    final LeafReader filteredReader = new SingleFieldFilterLeafReader(leafReader, fieldName);
:      // for each SpanQuery, collect the member spans into a map.
:      Map<BytesRef, Spans> result = new HashMap<>();
:      for (SpanQuery spanQuery : spanQueries) {
: -      getTermToSpans(spanQuery, leafReader.getContext(), doc, result);
: +      getTermToSpans(spanQuery, filteredReader.getContext(), doc, result);
:      }
:      return result;
:    }
: @@ -203,15 +239,14 @@ public class PhraseHelper {
:                                int doc, Map<BytesRef, Spans> result)
:        throws IOException {
:      // note: in WSTE there was some field specific looping that seemed pointless so that isn't here.
: -    final IndexSearcher searcher = new IndexSearcher(readerContext);
: +    final IndexSearcher searcher = new IndexSearcher(readerContext.reader());
:      searcher.setQueryCache(null);
:      if (willRewrite) {
:        spanQuery = (SpanQuery) searcher.rewrite(spanQuery); // searcher.rewrite loops till done
:      }
:  
:      // Get the underlying query terms
: -
: -    TreeSet<Term> termSet = new TreeSet<>(); // sorted so we can loop over results in order shortly...
: +    TreeSet<Term> termSet = new FieldFilteringTermSet(); // sorted so we can loop over results in order shortly...
:      searcher.createWeight(spanQuery, false).extractTerms(termSet);//needsScores==false
:  
:      // Get Spans by running the query against the reader
: @@ -240,9 +275,6 @@ public class PhraseHelper {
:      for (final Term queryTerm : termSet) {
:        // note: we expect that at least one query term will pass these filters. This is because the collected
:        //   spanQuery list were already filtered by these conditions.
: -      if (fieldName != null && fieldName.equals(queryTerm.field()) == false) {
: -        continue;
: -      }
:        if (positionInsensitiveTerms.contains(queryTerm)) {
:          continue;
:        }
: @@ -375,19 +407,17 @@ public class PhraseHelper {
:    }
:  
:    /**
: -   * Simple HashSet that filters out Terms not matching a desired field on {@code add()}.
: +   * Simple TreeSet that filters out Terms not matching the provided predicate on {@code add()}.
:     */
: -  private static class FieldFilteringTermHashSet extends HashSet<Term> {
: -    private final String field;
: -
: -    FieldFilteringTermHashSet(String field) {
: -      this.field = field;
: -    }
: -
: +  private class FieldFilteringTermSet extends TreeSet<Term> {
:      @Override
:      public boolean add(Term term) {
: -      if (term.field().equals(field)) {
: -        return super.add(term);
: +      if (fieldMatcher.test(term.field())) {
: +        if (term.field().equals(fieldName)) {
: +          return super.add(term);
: +        } else {
: +          return super.add(new Term(fieldName, term.bytes()));
: +        }
:        } else {
:          return false;
:        }
: @@ -500,6 +530,64 @@ public class PhraseHelper {
:    }
:  
:    /**
: +   * This reader will just delegate every call to a single field in the wrapped
: +   * LeafReader. This way we ensure that all queries going through this reader target the same field.
: +  */
: +  static final class SingleFieldFilterLeafReader extends FilterLeafReader {
: +    final String fieldName;
: +    SingleFieldFilterLeafReader(LeafReader in, String fieldName) {
: +      super(in);
: +      this.fieldName = fieldName;
: +    }
: +
: +    @Override
: +    public FieldInfos getFieldInfos() {
: +      throw new UnsupportedOperationException();
: +    }
: +
: +    @Override
: +    public Fields fields() throws IOException {
: +      return new FilterFields(super.fields()) {
: +        @Override
: +        public Terms terms(String field) throws IOException {
: +          return super.terms(fieldName);
: +        }
: +
: +        @Override
: +        public Iterator<String> iterator() {
: +          return Collections.singletonList(fieldName).iterator();
: +        }
: +
: +        @Override
: +        public int size() {
: +          return 1;
: +        }
: +      };
: +    }
: +
: +    @Override
: +    public NumericDocValues getNumericDocValues(String field) throws IOException {
: +      return super.getNumericDocValues(fieldName);
: +    }
: +
: +    @Override
: +    public BinaryDocValues getBinaryDocValues(String field) throws IOException {
: +      return super.getBinaryDocValues(fieldName);
: +    }
: +
: +    @Override
: +    public SortedDocValues getSortedDocValues(String field) throws IOException {
: +      return super.getSortedDocValues(fieldName);
: +    }
: +
: +    @Override
: +    public NumericDocValues getNormValues(String field) throws IOException {
: +      return super.getNormValues(fieldName);
: +    }
: +  }
: +
: +
: +  /**
:     * A Spans based on a list of cached spans for one doc.  It is pre-positioned to this doc.
:     */
:    private static class CachedSpans extends Spans {
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
: index ac5f0f6..bbcfd5b 100644
: --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
: +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
: @@ -24,6 +24,7 @@ import java.util.Arrays;
:  import java.util.Collection;
:  import java.util.EnumSet;
:  import java.util.HashMap;
: +import java.util.HashSet;
:  import java.util.List;
:  import java.util.Locale;
:  import java.util.Map;
: @@ -31,6 +32,7 @@ import java.util.Objects;
:  import java.util.Set;
:  import java.util.SortedSet;
:  import java.util.TreeSet;
: +import java.util.function.Predicate;
:  import java.util.function.Supplier;
:  
:  import org.apache.lucene.analysis.Analyzer;
: @@ -58,7 +60,6 @@ import org.apache.lucene.search.Weight;
:  import org.apache.lucene.search.spans.SpanQuery;
:  import org.apache.lucene.util.BytesRef;
:  import org.apache.lucene.util.InPlaceMergeSorter;
: -import org.apache.lucene.util.UnicodeUtil;
:  import org.apache.lucene.util.automaton.CharacterRunAutomaton;
:  
:  /**
: @@ -119,13 +120,13 @@ public class UnifiedHighlighter {
:  
:    private boolean defaultPassageRelevancyOverSpeed = true; //For analysis, prefer MemoryIndexOffsetStrategy
:  
: -  // private boolean defaultRequireFieldMatch = true; TODO
: -
:    private int maxLength = DEFAULT_MAX_LENGTH;
:  
:    // BreakIterator is stateful so we use a Supplier factory method
:    private Supplier<BreakIterator> defaultBreakIterator = () -> BreakIterator.getSentenceInstance(Locale.ROOT);
:  
: +  private Predicate<String> defaultFieldMatcher;
: +
:    private PassageScorer defaultScorer = new PassageScorer();
:  
:    private PassageFormatter defaultFormatter = new DefaultPassageFormatter();
: @@ -140,8 +141,8 @@ public class UnifiedHighlighter {
:    /**
:     * Calls {@link Weight#extractTerms(Set)} on an empty index for the query.
:     */
: -  protected static SortedSet<Term> extractTerms(Query query) throws IOException {
: -    SortedSet<Term> queryTerms = new TreeSet<>();
: +  protected static Set<Term> extractTerms(Query query) throws IOException {
: +    Set<Term> queryTerms = new HashSet<>();
:      EMPTY_INDEXSEARCHER.createNormalizedWeight(query, false).extractTerms(queryTerms);
:      return queryTerms;
:    }
: @@ -197,6 +198,10 @@ public class UnifiedHighlighter {
:      this.cacheFieldValCharsThreshold = cacheFieldValCharsThreshold;
:    }
:  
: +  public void setFieldMatcher(Predicate<String> predicate) {
: +    this.defaultFieldMatcher = predicate;
: +  }
: +
:    /**
:     * Returns whether {@link MultiTermQuery} derivatives will be highlighted.  By default it's enabled.  MTQ
:     * highlighting can be expensive, particularly when using offsets in postings.
: @@ -220,6 +225,18 @@ public class UnifiedHighlighter {
:      return defaultPassageRelevancyOverSpeed;
:    }
:  
: +  /**
: +   * Returns the predicate to use for extracting the query part that must be highlighted.
: +   * By default only queries that target the current field are kept. (AKA requireFieldMatch)
: +   */
: +  protected Predicate<String> getFieldMatcher(String field) {
: +    if (defaultFieldMatcher != null) {
: +      return defaultFieldMatcher;
: +    } else {
: +      // requireFieldMatch = true
: +      return (qf) -> field.equals(qf);
: +    }
: +  }
:  
:    /**
:     * The maximum content size to process.  Content will be truncated to this size before highlighting. Typically
: @@ -548,7 +565,7 @@ public class UnifiedHighlighter {
:      copyAndSortFieldsWithMaxPassages(fieldsIn, maxPassagesIn, fields, maxPassages); // latter 2 are "out" params
:  
:      // Init field highlighters (where most of the highlight logic lives, and on a per field basis)
: -    SortedSet<Term> queryTerms = extractTerms(query);
: +    Set<Term> queryTerms = extractTerms(query);
:      FieldHighlighter[] fieldHighlighters = new FieldHighlighter[fields.length];
:      int numTermVectors = 0;
:      int numPostings = 0;
: @@ -718,13 +735,13 @@ public class UnifiedHighlighter {
:            getClass().getSimpleName() + " without an IndexSearcher.");
:      }
:      Objects.requireNonNull(content, "content is required");
: -    SortedSet<Term> queryTerms = extractTerms(query);
: +    Set<Term> queryTerms = extractTerms(query);
:      return getFieldHighlighter(field, query, queryTerms, maxPassages)
:          .highlightFieldForDoc(null, -1, content);
:    }
:  
: -  protected FieldHighlighter getFieldHighlighter(String field, Query query, SortedSet<Term> allTerms, int maxPassages) {
: -    BytesRef[] terms = filterExtractedTerms(field, allTerms);
: +  protected FieldHighlighter getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages) {
: +    BytesRef[] terms = filterExtractedTerms(getFieldMatcher(field), allTerms);
:      Set<HighlightFlag> highlightFlags = getFlags(field);
:      PhraseHelper phraseHelper = getPhraseHelper(field, query, highlightFlags);
:      CharacterRunAutomaton[] automata = getAutomata(field, query, highlightFlags);
: @@ -738,19 +755,15 @@ public class UnifiedHighlighter {
:          getFormatter(field));
:    }
:  
: -  protected static BytesRef[] filterExtractedTerms(String field, SortedSet<Term> queryTerms) {
: -    // TODO consider requireFieldMatch
: -    Term floor = new Term(field, "");
: -    Term ceiling = new Term(field, UnicodeUtil.BIG_TERM);
: -    SortedSet<Term> fieldTerms = queryTerms.subSet(floor, ceiling);
: -
: -    // Strip off the redundant field:
: -    BytesRef[] terms = new BytesRef[fieldTerms.size()];
: -    int termUpto = 0;
: -    for (Term term : fieldTerms) {
: -      terms[termUpto++] = term.bytes();
: +  protected static BytesRef[] filterExtractedTerms(Predicate<String> fieldMatcher, Set<Term> queryTerms) {
: +    // Strip off the redundant field and sort the remaining terms
: +    SortedSet<BytesRef> filteredTerms = new TreeSet<>();
: +    for (Term term : queryTerms) {
: +      if (fieldMatcher.test(term.field())) {
: +        filteredTerms.add(term.bytes());
: +      }
:      }
: -    return terms;
: +    return filteredTerms.toArray(new BytesRef[filteredTerms.size()]);
:    }
:  
:    protected Set<HighlightFlag> getFlags(String field) {
: @@ -771,14 +784,13 @@ public class UnifiedHighlighter {
:      boolean highlightPhrasesStrictly = highlightFlags.contains(HighlightFlag.PHRASES);
:      boolean handleMultiTermQuery = highlightFlags.contains(HighlightFlag.MULTI_TERM_QUERY);
:      return highlightPhrasesStrictly ?
: -        new PhraseHelper(query, field, this::requiresRewrite, this::preSpanQueryRewrite, !handleMultiTermQuery) :
: -        PhraseHelper.NONE;
: +        new PhraseHelper(query, field, getFieldMatcher(field),
: +            this::requiresRewrite, this::preSpanQueryRewrite, !handleMultiTermQuery) : PhraseHelper.NONE;
:    }
:  
:    protected CharacterRunAutomaton[] getAutomata(String field, Query query, Set<HighlightFlag> highlightFlags) {
:      return highlightFlags.contains(HighlightFlag.MULTI_TERM_QUERY)
: -        ? MultiTermHighlighting.extractAutomata(query, field, !highlightFlags.contains(HighlightFlag.PHRASES),
: -          this::preMultiTermQueryRewrite)
: +        ? MultiTermHighlighting.extractAutomata(query, getFieldMatcher(field), !highlightFlags.contains(HighlightFlag.PHRASES), this::preMultiTermQueryRewrite)
:          : ZERO_LEN_AUTOMATA_ARRAY;
:    }
:  
: @@ -826,7 +838,7 @@ public class UnifiedHighlighter {
:            //skip using a memory index since it's pure term filtering
:            return new TokenStreamOffsetStrategy(field, terms, phraseHelper, automata, getIndexAnalyzer());
:          } else {
: -          return new MemoryIndexOffsetStrategy(field, terms, phraseHelper, automata, getIndexAnalyzer(),
: +          return new MemoryIndexOffsetStrategy(field, getFieldMatcher(field), terms, phraseHelper, automata, getIndexAnalyzer(),
:                this::preMultiTermQueryRewrite);
:          }
:        case NONE_NEEDED:
: 
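
For anyone reading along without a Lucene checkout, the reworked filterExtractedTerms above boils down to "test each term's field against a predicate, then sort what is left". A rough standalone sketch of that idea follows. It is only an illustration under stated assumptions: FieldFilterSketch and SimpleTerm are hypothetical names invented here (SimpleTerm stands in for Lucene's Term), and term text is a plain String instead of a BytesRef.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical stand-in for the patch's term filtering; not Lucene code.
class FieldFilterSketch {

  // Minimal substitute for org.apache.lucene.index.Term: a field name plus term text.
  static final class SimpleTerm {
    final String field;
    final String text;

    SimpleTerm(String field, String text) {
      this.field = field;
      this.text = text;
    }
  }

  // Keep the text of every term whose field passes the matcher,
  // sorted and de-duplicated by the TreeSet.
  static String[] filterExtractedTerms(Predicate<String> fieldMatcher, List<SimpleTerm> queryTerms) {
    SortedSet<String> filtered = new TreeSet<>();
    for (SimpleTerm term : queryTerms) {
      if (fieldMatcher.test(term.field)) {
        filtered.add(term.text);
      }
    }
    return filtered.toArray(new String[0]);
  }

  public static void main(String[] args) {
    List<SimpleTerm> terms = Arrays.asList(
        new SimpleTerm("text", "some"),
        new SimpleTerm("title", "is"),
        new SimpleTerm("category", "this"),
        new SimpleTerm("text", "this"));

    // requireFieldMatch=true: only terms aimed at the highlighted field survive.
    System.out.println(Arrays.toString(filterExtractedTerms("text"::equals, terms)));
    // requireFieldMatch=false: terms from every field are kept.
    System.out.println(Arrays.toString(filterExtractedTerms(qf -> true, terms)));
  }
}
```

Because the predicate can accept several fields at once, the same term text may arrive from more than one field; collecting into a sorted set handles both the ordering and the duplicates, which is presumably why the patch no longer needs the incoming set to be a SortedSet<Term>.
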
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
: index 0fd7d3d..ddf8a92 100644
: --- a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
: +++ b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
: @@ -25,6 +25,7 @@ import java.util.Arrays;
:  import java.util.Collections;
:  import java.util.List;
:  import java.util.Map;
: +import java.util.function.Predicate;
:  
:  import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;
:  import org.apache.lucene.analysis.MockAnalyzer;
: @@ -32,14 +33,17 @@ import org.apache.lucene.analysis.MockTokenizer;
:  import org.apache.lucene.document.Document;
:  import org.apache.lucene.document.Field;
:  import org.apache.lucene.document.FieldType;
: +import org.apache.lucene.index.IndexOptions;
:  import org.apache.lucene.index.IndexReader;
:  import org.apache.lucene.index.RandomIndexWriter;
:  import org.apache.lucene.index.Term;
:  import org.apache.lucene.search.BooleanClause;
:  import org.apache.lucene.search.BooleanQuery;
:  import org.apache.lucene.search.DocIdSetIterator;
: +import org.apache.lucene.search.FuzzyQuery;
:  import org.apache.lucene.search.IndexSearcher;
:  import org.apache.lucene.search.PhraseQuery;
: +import org.apache.lucene.search.PrefixQuery;
:  import org.apache.lucene.search.Query;
:  import org.apache.lucene.search.ScoreDoc;
:  import org.apache.lucene.search.Sort;
: @@ -959,4 +963,275 @@ public class TestUnifiedHighlighter extends LuceneTestCase {
:      ir.close();
:    }
:  
: +  private IndexReader indexSomeFields() throws IOException {
: +    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, indexAnalyzer);
: +    FieldType ft = new FieldType();
: +    ft.setIndexOptions(IndexOptions.NONE);
: +    ft.setTokenized(false);
: +    ft.setStored(true);
: +    ft.freeze();
: +
: +    Field title = new Field("title", "", fieldType);
: +    Field text = new Field("text", "", fieldType);
: +    Field category = new Field("category", "", fieldType);
: +
: +    Document doc = new Document();
: +    doc.add(title);
: +    doc.add(text);
: +    doc.add(category);
: +    title.setStringValue("This is the title field.");
: +    text.setStringValue("This is the text field. You can put some text if you want.");
: +    category.setStringValue("This is the category field.");
: +    iw.addDocument(doc);
: +
: +    IndexReader ir = iw.getReader();
: +    iw.close();
: +    return ir;
: +  }
: +
: +  public void testFieldMatcherTermQuery() throws Exception {
: +    IndexReader ir = indexSomeFields();
: +    IndexSearcher searcher = newSearcher(ir);
: +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
: +      @Override
: +      protected Predicate<String> getFieldMatcher(String field) {
: +        // requireFieldMatch=false
: +        return (qf) -> true;
: +      }
: +    };
: +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
: +    BooleanQuery.Builder queryBuilder =
: +        new BooleanQuery.Builder()
: +            .add(new TermQuery(new Term("text", "some")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("text", "field")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("text", "this")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("title", "is")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("title", "this")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("category", "this")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("category", "some")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("category", "category")), BooleanClause.Occur.SHOULD);
: +    Query query = queryBuilder.build();
: +
: +    // title
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the title <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the title <b>field</b>.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // text
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the text field. ", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // category
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the <b>category</b> <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the <b>category</b> field.", snippets[0]);
: +
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +    ir.close();
: +  }
: +
: +  public void testFieldMatcherMultiTermQuery() throws Exception {
: +    IndexReader ir = indexSomeFields();
: +    IndexSearcher searcher = newSearcher(ir);
: +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
: +      @Override
: +      protected Predicate<String> getFieldMatcher(String field) {
: +        // requireFieldMatch=false
: +        return (qf) -> true;
: +      }
: +    };
: +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
: +    BooleanQuery.Builder queryBuilder =
: +        new BooleanQuery.Builder()
: +            .add(new FuzzyQuery(new Term("text", "sime"), 1), BooleanClause.Occur.SHOULD)
: +            .add(new PrefixQuery(new Term("text", "fie")), BooleanClause.Occur.SHOULD)
: +            .add(new PrefixQuery(new Term("text", "thi")), BooleanClause.Occur.SHOULD)
: +            .add(new TermQuery(new Term("title", "is")), BooleanClause.Occur.SHOULD)
: +            .add(new PrefixQuery(new Term("title", "thi")), BooleanClause.Occur.SHOULD)
: +            .add(new PrefixQuery(new Term("category", "thi")), BooleanClause.Occur.SHOULD)
: +            .add(new FuzzyQuery(new Term("category", "sime"), 1), BooleanClause.Occur.SHOULD)
: +            .add(new PrefixQuery(new Term("category", "categ")), BooleanClause.Occur.SHOULD);
: +    Query query = queryBuilder.build();
: +
: +    // title
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the title <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the title <b>field</b>.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // text
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the text field. ", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // category
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the <b>category</b> <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> is the <b>category</b> field.", snippets[0]);
: +
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +    ir.close();
: +  }
: +
: +  public void testFieldMatcherPhraseQuery() throws Exception {
: +    IndexReader ir = indexSomeFields();
: +    IndexSearcher searcher = newSearcher(ir);
: +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
: +      @Override
: +      protected Predicate<String> getFieldMatcher(String field) {
: +        // requireFieldMatch=false
: +        return (qf) -> true;
: +      }
: +    };
: +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
: +    BooleanQuery.Builder queryBuilder =
: +        new BooleanQuery.Builder()
: +            .add(new PhraseQuery("title", "this", "is", "the", "title"), BooleanClause.Occur.SHOULD)
: +            .add(new PhraseQuery(2, "category", "this", "is", "the", "field"), BooleanClause.Occur.SHOULD)
: +            .add(new PhraseQuery("text", "this", "is"), BooleanClause.Occur.SHOULD)
: +            .add(new PhraseQuery("category", "this", "is"), BooleanClause.Occur.SHOULD)
: +            .add(new PhraseQuery(1, "text", "you", "can", "put", "text"), BooleanClause.Occur.SHOULD);
: +    Query query = queryBuilder.build();
: +
: +    // title
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>title</b> <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>title</b> field.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // text
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>text</b> <b>field</b>. <b>You</b> <b>can</b> <b>put</b> some <b>text</b> if you want.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the <b>text</b> field. <b>You</b> <b>can</b> <b>put</b> some <b>text</b> if you want.", snippets[0]);
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("This is the text field. You can put some text if you want.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +
: +    // category
: +    {
: +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
: +      assertEquals(1, topDocs.totalHits);
: +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> <b>the</b> category <b>field</b>.", snippets[0]);
: +
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> <b>the</b> category <b>field</b>.", snippets[0]);
: +
: +
: +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
: +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
: +      assertEquals(1, snippets.length);
: +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
: +      highlighterFieldMatch.setFieldMatcher(null);
: +    }
: +    ir.close();
: +  }
:  }
: 
: http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
: ----------------------------------------------------------------------
: diff --git a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
: index d150940..10757a5 100644
: --- a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
: +++ b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
: @@ -23,7 +23,6 @@ import java.util.Collections;
:  import java.util.List;
:  import java.util.Map;
:  import java.util.Set;
: -import java.util.SortedSet;
:  
:  import org.apache.lucene.analysis.Analyzer;
:  import org.apache.lucene.analysis.MockAnalyzer;
: @@ -144,7 +143,7 @@ public class TestUnifiedHighlighterExtensibility extends LuceneTestCase {
:        }
:  
:        @Override
: -      protected FieldHighlighter getFieldHighlighter(String field, Query query, SortedSet<Term> allTerms, int maxPassages) {
: +      protected FieldHighlighter getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages) {
:          return super.getFieldHighlighter(field, query, allTerms, maxPassages);
:        }
:  
: 
: 
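
The tests in the diff above flip between matchers with setFieldMatcher(...) and return to the default with setFieldMatcher(null). A minimal sketch of that lookup, assuming nothing beyond the getFieldMatcher body shown in the diff: FieldMatcherLookupSketch is a hypothetical class written for this note, not part of Lucene, and it models only the matcher resolution, not any highlighting.

```java
import java.util.function.Predicate;

// Hypothetical miniature of the matcher lookup; the real logic lives in UnifiedHighlighter.
class FieldMatcherLookupSketch {

  // null means "no override": fall back to strict per-field matching.
  private Predicate<String> defaultFieldMatcher;

  void setFieldMatcher(Predicate<String> matcher) {
    this.defaultFieldMatcher = matcher;
  }

  // Same shape as getFieldMatcher(String) in the diff: an explicit matcher wins,
  // otherwise only query fields equal to the highlighted field are accepted.
  Predicate<String> getFieldMatcher(String field) {
    if (defaultFieldMatcher != null) {
      return defaultFieldMatcher;
    }
    return field::equals; // requireFieldMatch = true
  }

  public static void main(String[] args) {
    FieldMatcherLookupSketch h = new FieldMatcherLookupSketch();

    // Default: highlighting "title" accepts only query terms on "title".
    System.out.println(h.getFieldMatcher("title").test("text"));

    // Override: while highlighting "title", accept query terms on "text" instead.
    h.setFieldMatcher("text"::equals);
    System.out.println(h.getFieldMatcher("title").test("text"));

    // Reset: passing null restores the strict default.
    h.setFieldMatcher(null);
    System.out.println(h.getFieldMatcher("title").test("text"));
  }
}
```

Note that an explicit matcher replaces the default rather than adding to it: with "text"::equals installed, query terms on "title" itself are no longer accepted when highlighting "title", which matches the third assertion in each "title" block of the tests.
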

-Hoss
http://www.lucidworks.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: lucene-solr:branch_6x: LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA requireFieldMatch=false)

Posted by David Smiley <da...@gmail.com>.
Ouch; just fixed!  Thanks for bringing this to my attention.  Resolving
CHANGES.txt deltas is definitely the most error-prone part of back-porting
a cherry-picked commit.

On Tue, Dec 6, 2016 at 7:23 PM Chris Hostetter <ho...@fucit.org>
wrote:

>
> David: something went haywire with your backport -- it added a 7.0.0
> section to CHANGES.txt, which is breaking the smoketester jenkins
>
> : Date: Mon,  5 Dec 2016 21:21:19 +0000 (UTC)
> : From: dsmiley@apache.org
> : Reply-To: dev@lucene.apache.org
> : To: commits@lucene.apache.org
> : Subject: lucene-solr:branch_6x: LUCENE-7575: Add UnifiedHighlighter field
> :     matcher predicate (AKA requireFieldMatch=false)
> :
> : Repository: lucene-solr
> : Updated Branches:
> :   refs/heads/branch_6x cdce62108 -> 4e7a7dbf9
> :
> :
> : LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA
> requireFieldMatch=false)
> :
> : (cherry picked from commit 2e948fe)
> :
> :
> : Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
> : Commit:
> http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/4e7a7dbf
> : Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/4e7a7dbf
> : Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/4e7a7dbf
> :
> : Branch: refs/heads/branch_6x
> : Commit: 4e7a7dbf9a56468f41e89f5289833081b27f1b14
> : Parents: cdce621
> : Author: David Smiley <ds...@apache.org>
> : Authored: Mon Dec 5 16:11:57 2016 -0500
> : Committer: David Smiley <ds...@apache.org>
> : Committed: Mon Dec 5 16:21:12 2016 -0500
> :
> : ----------------------------------------------------------------------
> :  lucene/CHANGES.txt                              |  56 ++++
> :  .../uhighlight/MemoryIndexOffsetStrategy.java   |  10 +-
> :  .../uhighlight/MultiTermHighlighting.java       |  37 +--
> :  .../lucene/search/uhighlight/PhraseHelper.java  | 158 ++++++++---
> :  .../search/uhighlight/UnifiedHighlighter.java   |  64 +++--
> :  .../uhighlight/TestUnifiedHighlighter.java      | 275
> +++++++++++++++++++
> :  .../TestUnifiedHighlighterExtensibility.java    |   3 +-
> :  7 files changed, 519 insertions(+), 84 deletions(-)
> : ----------------------------------------------------------------------
> :
> :
> :
> http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/CHANGES.txt
> : ----------------------------------------------------------------------
> : diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
> : index b0a5f9c..853f171 100644
> : --- a/lucene/CHANGES.txt
> : +++ b/lucene/CHANGES.txt
> : @@ -3,6 +3,57 @@ Lucene Change Log
> :  For more information on past and future Lucene versions, please see:
> :  http://s.apache.org/luceneversions
> :
> : +======================= Lucene 7.0.0 =======================
> : +
> : +API Changes
> : +
> : +* LUCENE-2605: Classic QueryParser no longer splits on whitespace by
> default.
> : +  Use setSplitOnWhitespace(true) to get the old behavior.  (Steve Rowe)
> : +
> : +* LUCENE-7369: Similarity.coord and BooleanQuery.disableCoord are
> removed.
> : +  (Adrien Grand)
> : +
> : +* LUCENE-7368: Removed query normalization. (Adrien Grand)
> : +
> : +* LUCENE-7355: AnalyzingQueryParser has been removed as its
> functionality has
> : +  been folded into the classic QueryParser. (Adrien Grand)
> : +
> : +* LUCENE-7407: Doc values APIs have been switched from random access
> : +  to iterators, enabling future codec compression improvements. (Mike
> : +  McCandless)
> : +
> : +* LUCENE-7475: Norms now support sparsity, allowing to pay for what is
> : +  actually used. (Adrien Grand)
> : +
> : +* LUCENE-7494: Points now have a per-field API, like doc values.
> (Adrien Grand)
> : +
> : +Bug Fixes
> : +
> : +Improvements
> : +
> : +* LUCENE-7489: Better storage of sparse doc-values fields with the
> default
> : +  codec. (Adrien Grand)
> : +
> : +Optimizations
> : +
> : +* LUCENE-7416: BooleanQuery optimizes queries that have queries that
> occur both
> : +  in the sets of SHOULD and FILTER clauses, or both in MUST/FILTER and
> MUST_NOT
> : +  clauses. (Spyros Kapnissis via Adrien Grand, Uwe Schindler)
> : +
> : +* LUCENE-7506: FastTaxonomyFacetCounts should use CPU in proportion to
> : +  the size of the intersected set of hits from the query and documents
> : +  that have a facet value, so sparse faceting works as expected
> : +  (Adrien Grand via Mike McCandless)
> : +
> : +* LUCENE-7519: Add optimized APIs to compute browse-only top level
> : +  facets (Mike McCandless)
> : +
> : +Other
> : +
> : +* LUCENE-7328: Remove LegacyNumericEncoding from GeoPointField. (Nick
> Knize)
> : +
> : +* LUCENE-7360: Remove Explanation.toHtml() (Alan Woodward)
> : +
> :  ======================= Lucene 6.4.0 =======================
> :
> :  API Changes
> : @@ -73,6 +124,11 @@ Improvements
> :  * LUCENE-7537: Index time sorting now supports multi-valued sorts
> :    using selectors (MIN, MAX, etc.) (Jim Ferenczi via Mike McCandless)
> :
> : +* LUCENE-7575: UnifiedHighlighter can now highlight fields with queries
> that don't
> : +  necessarily refer to that field (AKA requireFieldMatch==false).
> Disabled by default.
> : +  See UH get/setFieldMatcher. (Jim Ferenczi via David Smiley)
> : +
> : +
> :  Optimizations
> :
> :  * LUCENE-7568: Optimize merging when index sorting is used but the
> :
> :
> http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
> : ----------------------------------------------------------------------
> : diff --git
> a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
> b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
> : index 4028912..0001a80 100644
> : ---
> a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
> : +++
> b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MemoryIndexOffsetStrategy.java
> : @@ -23,6 +23,7 @@ import java.util.Collection;
> :  import java.util.Collections;
> :  import java.util.List;
> :  import java.util.function.Function;
> : +import java.util.function.Predicate;
> :
> :  import org.apache.lucene.analysis.Analyzer;
> :  import org.apache.lucene.analysis.FilteringTokenFilter;
> : @@ -49,7 +50,7 @@ public class MemoryIndexOffsetStrategy extends
> AnalysisOffsetStrategy {
> :    private final LeafReader leafReader;
> :    private final CharacterRunAutomaton preMemIndexFilterAutomaton;
> :
> : -  public MemoryIndexOffsetStrategy(String field, BytesRef[]
> extractedTerms, PhraseHelper phraseHelper,
> : +  public MemoryIndexOffsetStrategy(String field, Predicate<String>
> fieldMatcher, BytesRef[] extractedTerms, PhraseHelper phraseHelper,
> :                                     CharacterRunAutomaton[] automata,
> Analyzer analyzer,
> :                                     Function<Query, Collection<Query>>
> multiTermQueryRewrite) {
> :      super(field, extractedTerms, phraseHelper, automata, analyzer);
> : @@ -57,13 +58,14 @@ public class MemoryIndexOffsetStrategy extends
> AnalysisOffsetStrategy {
> :      memoryIndex = new MemoryIndex(true, storePayloads);//true==store
> offsets
> :      leafReader = (LeafReader)
> memoryIndex.createSearcher().getIndexReader(); // appears to be re-usable
> :      // preFilter for MemoryIndex
> : -    preMemIndexFilterAutomaton = buildCombinedAutomaton(field, terms,
> this.automata, phraseHelper, multiTermQueryRewrite);
> : +    preMemIndexFilterAutomaton = buildCombinedAutomaton(fieldMatcher,
> terms, this.automata, phraseHelper, multiTermQueryRewrite);
> :    }
> :
> :    /**
> :     * Build one {@link CharacterRunAutomaton} matching any term the
> query might match.
> :     */
> : -  private static CharacterRunAutomaton buildCombinedAutomaton(String
> field, BytesRef[] terms,
> : +  private static CharacterRunAutomaton
> buildCombinedAutomaton(Predicate<String> fieldMatcher,
> : +
> BytesRef[] terms,
> :
> CharacterRunAutomaton[] automata,
> :
> PhraseHelper strictPhrases,
> :
> Function<Query, Collection<Query>> multiTermQueryRewrite) {
> : @@ -74,7 +76,7 @@ public class MemoryIndexOffsetStrategy extends
> AnalysisOffsetStrategy {
> :      Collections.addAll(allAutomata, automata);
> :      for (SpanQuery spanQuery : strictPhrases.getSpanQueries()) {
> :        Collections.addAll(allAutomata,
> : -          MultiTermHighlighting.extractAutomata(spanQuery, field, true,
> multiTermQueryRewrite));//true==lookInSpan
> : +          MultiTermHighlighting.extractAutomata(spanQuery,
> fieldMatcher, true, multiTermQueryRewrite));//true==lookInSpan
> :      }
> :
> :      if (allAutomata.size() == 1) {
> :
> :
> http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
> : ----------------------------------------------------------------------
> : diff --git
> a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
> b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
> : index fd6a26a..267d603 100644
> : ---
> a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
> : +++
> b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/MultiTermHighlighting.java
> : @@ -22,6 +22,7 @@ import java.util.Collection;
> :  import java.util.Comparator;
> :  import java.util.List;
> :  import java.util.function.Function;
> : +import java.util.function.Predicate;
> :
> :  import org.apache.lucene.index.Term;
> :  import org.apache.lucene.search.AutomatonQuery;
> : @@ -56,50 +57,52 @@ class MultiTermHighlighting {
> :    }
> :
> :    /**
> : -   * Extracts all MultiTermQueries for {@code field}, and returns
> equivalent
> : -   * automata that will match terms.
> : +   * Extracts MultiTermQueries that match the provided field predicate.
> : +   * Returns equivalent automata that will match terms.
> :     */
> : -  public static CharacterRunAutomaton[] extractAutomata(Query query,
> String field, boolean lookInSpan,
> : +  public static CharacterRunAutomaton[] extractAutomata(Query query,
> : +
> Predicate<String> fieldMatcher,
> : +                                                        boolean
> lookInSpan,
> :                                                          Function<Query,
> Collection<Query>> preRewriteFunc) {
> :      List<CharacterRunAutomaton> list = new ArrayList<>();
> :      Collection<Query> customSubQueries = preRewriteFunc.apply(query);
> :      if (customSubQueries != null) {
> :        for (Query sub : customSubQueries) {
> : -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
> : +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
> :        }
> :      } else if (query instanceof BooleanQuery) {
> :        for (BooleanClause clause : (BooleanQuery) query) {
> :          if (!clause.isProhibited()) {
> : -          list.addAll(Arrays.asList(extractAutomata(clause.getQuery(), field, lookInSpan, preRewriteFunc)));
> : +          list.addAll(Arrays.asList(extractAutomata(clause.getQuery(), fieldMatcher, lookInSpan, preRewriteFunc)));
> :          }
> :        }
> :      } else if (query instanceof ConstantScoreQuery) {
> : -      list.addAll(Arrays.asList(extractAutomata(((ConstantScoreQuery) query).getQuery(), field, lookInSpan,
> : +      list.addAll(Arrays.asList(extractAutomata(((ConstantScoreQuery) query).getQuery(), fieldMatcher, lookInSpan,
> :            preRewriteFunc)));
> :      } else if (query instanceof DisjunctionMaxQuery) {
> :        for (Query sub : ((DisjunctionMaxQuery) query).getDisjuncts()) {
> : -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
> : +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
> :        }
> :      } else if (lookInSpan && query instanceof SpanOrQuery) {
> :        for (Query sub : ((SpanOrQuery) query).getClauses()) {
> : -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
> : +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
> :        }
> :      } else if (lookInSpan && query instanceof SpanNearQuery) {
> :        for (Query sub : ((SpanNearQuery) query).getClauses()) {
> : -        list.addAll(Arrays.asList(extractAutomata(sub, field, lookInSpan, preRewriteFunc)));
> : +        list.addAll(Arrays.asList(extractAutomata(sub, fieldMatcher, lookInSpan, preRewriteFunc)));
> :        }
> :      } else if (lookInSpan && query instanceof SpanNotQuery) {
> : -      list.addAll(Arrays.asList(extractAutomata(((SpanNotQuery) query).getInclude(), field, lookInSpan,
> : +      list.addAll(Arrays.asList(extractAutomata(((SpanNotQuery) query).getInclude(), fieldMatcher, lookInSpan,
> :            preRewriteFunc)));
> :      } else if (lookInSpan && query instanceof SpanPositionCheckQuery) {
> : -      list.addAll(Arrays.asList(extractAutomata(((SpanPositionCheckQuery) query).getMatch(), field, lookInSpan,
> : +      list.addAll(Arrays.asList(extractAutomata(((SpanPositionCheckQuery) query).getMatch(), fieldMatcher, lookInSpan,
> :            preRewriteFunc)));
> :      } else if (lookInSpan && query instanceof SpanMultiTermQueryWrapper) {
> : -      list.addAll(Arrays.asList(extractAutomata(((SpanMultiTermQueryWrapper<?>) query).getWrappedQuery(), field,
> : -          lookInSpan, preRewriteFunc)));
> : +      list.addAll(Arrays.asList(extractAutomata(((SpanMultiTermQueryWrapper<?>) query).getWrappedQuery(),
> : +          fieldMatcher, lookInSpan, preRewriteFunc)));
> :      } else if (query instanceof AutomatonQuery) {
> :        final AutomatonQuery aq = (AutomatonQuery) query;
> : -      if (aq.getField().equals(field)) {
> : +      if (fieldMatcher.test(aq.getField())) {
> :          list.add(new CharacterRunAutomaton(aq.getAutomaton()) {
> :            @Override
> :            public String toString() {
> : @@ -110,7 +113,7 @@ class MultiTermHighlighting {
> :      } else if (query instanceof PrefixQuery) {
> :        final PrefixQuery pq = (PrefixQuery) query;
> :        Term prefix = pq.getPrefix();
> : -      if (prefix.field().equals(field)) {
> : +      if (fieldMatcher.test(prefix.field())) {
> :          list.add(new CharacterRunAutomaton(Operations.concatenate(Automata.makeString(prefix.text()),
> :              Automata.makeAnyString())) {
> :            @Override
> : @@ -121,7 +124,7 @@ class MultiTermHighlighting {
> :        }
> :      } else if (query instanceof FuzzyQuery) {
> :        final FuzzyQuery fq = (FuzzyQuery) query;
> : -      if (fq.getField().equals(field)) {
> : +      if (fieldMatcher.test(fq.getField())) {
> :          String utf16 = fq.getTerm().text();
> :          int termText[] = new int[utf16.codePointCount(0, utf16.length())];
> :          for (int cp, i = 0, j = 0; i < utf16.length(); i += Character.charCount(cp)) {
> : @@ -142,7 +145,7 @@ class MultiTermHighlighting {
> :        }
> :      } else if (query instanceof TermRangeQuery) {
> :        final TermRangeQuery tq = (TermRangeQuery) query;
> : -      if (tq.getField().equals(field)) {
> : +      if (fieldMatcher.test(tq.getField())) {
> :          final CharsRef lowerBound;
> :          if (tq.getLowerTerm() == null) {
> :            lowerBound = null;
> :
> :
> : http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
> : ----------------------------------------------------------------------
> : diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
> : index 7693eb2..0c7897f 100644
> : --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
> : +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/PhraseHelper.java
> : @@ -16,17 +16,50 @@
> :   */
> :  package org.apache.lucene.search.uhighlight;
> :
> : -import org.apache.lucene.index.*;
> : -import org.apache.lucene.search.*;
> : +import java.io.IOException;
> : +import java.util.ArrayList;
> : +import java.util.Arrays;
> : +import java.util.Collection;
> : +import java.util.Collections;
> : +import java.util.Comparator;
> : +import java.util.HashMap;
> : +import java.util.HashSet;
> : +import java.util.Iterator;
> : +import java.util.LinkedHashSet;
> : +import java.util.List;
> : +import java.util.Map;
> : +import java.util.PriorityQueue;
> : +import java.util.Set;
> : +import java.util.TreeSet;
> : +import java.util.function.Function;
> : +import java.util.function.Predicate;
> : +
> : +import org.apache.lucene.index.BinaryDocValues;
> : +import org.apache.lucene.index.FieldInfos;
> : +import org.apache.lucene.index.Fields;
> : +import org.apache.lucene.index.FilterLeafReader;
> : +import org.apache.lucene.index.LeafReader;
> : +import org.apache.lucene.index.LeafReaderContext;
> : +import org.apache.lucene.index.NumericDocValues;
> : +import org.apache.lucene.index.PostingsEnum;
> : +import org.apache.lucene.index.SortedDocValues;
> : +import org.apache.lucene.index.Term;
> : +import org.apache.lucene.index.Terms;
> : +import org.apache.lucene.search.DocIdSetIterator;
> : +import org.apache.lucene.search.IndexSearcher;
> : +import org.apache.lucene.search.MatchAllDocsQuery;
> : +import org.apache.lucene.search.MultiTermQuery;
> : +import org.apache.lucene.search.Query;
> : +import org.apache.lucene.search.TwoPhaseIterator;
> :  import org.apache.lucene.search.highlight.WeightedSpanTerm;
> :  import org.apache.lucene.search.highlight.WeightedSpanTermExtractor;
> : -import org.apache.lucene.search.spans.*;
> : +import org.apache.lucene.search.spans.SpanCollector;
> : +import org.apache.lucene.search.spans.SpanMultiTermQueryWrapper;
> : +import org.apache.lucene.search.spans.SpanQuery;
> : +import org.apache.lucene.search.spans.SpanWeight;
> : +import org.apache.lucene.search.spans.Spans;
> :  import org.apache.lucene.util.BytesRef;
> :
> : -import java.io.IOException;
> : -import java.util.*;
> : -import java.util.function.Function;
> : -
> :  /**
> :   * Helps the {@link FieldOffsetStrategy} with strict position highlighting (e.g. highlight phrases correctly).
> :   * This is a stateful class holding information about the query, but it can (and is) re-used across highlighting
> : @@ -40,7 +73,7 @@ import java.util.function.Function;
> :  public class PhraseHelper {
> :
> :    public static final PhraseHelper NONE = new PhraseHelper(new MatchAllDocsQuery(), "_ignored_",
> : -      spanQuery -> null, query -> null, true);
> : +      (s) -> false, spanQuery -> null, query -> null, true);
> :
> :    //TODO it seems this ought to be a general thing on Spans?
> :    private static final Comparator<? super Spans> SPANS_COMPARATOR = (o1, o2) -> {
> : @@ -59,10 +92,11 @@ public class PhraseHelper {
> :      }
> :    };
> :
> : -  private final String fieldName; // if non-null, only look at queries/terms for this field
> : +  private final String fieldName;
> :    private final Set<Term> positionInsensitiveTerms; // (TermQuery terms)
> :    private final Set<SpanQuery> spanQueries;
> :    private final boolean willRewrite;
> : +  private final Predicate<String> fieldMatcher;
> :
> :    /**
> :     * Constructor.
> : @@ -73,14 +107,15 @@ public class PhraseHelper {
> :     * to be set before the {@link WeightedSpanTermExtractor}'s extraction is invoked.
> :     * {@code ignoreQueriesNeedingRewrite} effectively ignores any query clause that needs to be "rewritten", which is
> :     * usually limited to just a {@link SpanMultiTermQueryWrapper} but could be other custom ones.
> : +   * {@code fieldMatcher} The field name predicate to use for extracting the query part that must be highlighted.
> :     */
> : -  public PhraseHelper(Query query, String field, Function<SpanQuery, Boolean> rewriteQueryPred,
> : +  public PhraseHelper(Query query, String field, Predicate<String> fieldMatcher, Function<SpanQuery, Boolean> rewriteQueryPred,
> :                        Function<Query, Collection<Query>> preExtractRewriteFunction,
> :                        boolean ignoreQueriesNeedingRewrite) {
> : -    this.fieldName = field; // if null then don't require field match
> : +    this.fieldName = field;
> : +    this.fieldMatcher = fieldMatcher;
> :      // filter terms to those we want
> : -    positionInsensitiveTerms = field != null ? new FieldFilteringTermHashSet(field) : new HashSet<>();
> : -    // requireFieldMatch optional
> : +    positionInsensitiveTerms = new FieldFilteringTermSet();
> :      spanQueries = new HashSet<>();
> :
> :      // TODO Have toSpanQuery(query) Function as an extension point for those with custom Query impls
> : @@ -131,11 +166,11 @@ public class PhraseHelper {
> :        @Override
> :        protected void extractWeightedSpanTerms(Map<String, WeightedSpanTerm> terms, SpanQuery spanQuery,
> :                                                float boost) throws IOException {
> : -        if (field != null) {
> : -          // if this span query isn't for this field, skip it.
> : -          Set<String> fieldNameSet = new HashSet<>();//TODO reuse.  note: almost always size 1
> : -          collectSpanQueryFields(spanQuery, fieldNameSet);
> : -          if (!fieldNameSet.contains(field)) {
> : +        // if this span query isn't for this field, skip it.
> : +        Set<String> fieldNameSet = new HashSet<>();//TODO reuse.  note: almost always size 1
> : +        collectSpanQueryFields(spanQuery, fieldNameSet);
> : +        for (String spanField : fieldNameSet) {
> : +          if (!fieldMatcher.test(spanField)) {
> :              return;
> :            }
> :          }
> : @@ -190,10 +225,11 @@ public class PhraseHelper {
> :      if (spanQueries.isEmpty()) {
> :        return Collections.emptyMap();
> :      }
> : +    final LeafReader filteredReader = new SingleFieldFilterLeafReader(leafReader, fieldName);
> :      // for each SpanQuery, collect the member spans into a map.
> :      Map<BytesRef, Spans> result = new HashMap<>();
> :      for (SpanQuery spanQuery : spanQueries) {
> : -      getTermToSpans(spanQuery, leafReader.getContext(), doc, result);
> : +      getTermToSpans(spanQuery, filteredReader.getContext(), doc, result);
> :      }
> :      return result;
> :    }
> : @@ -203,15 +239,14 @@ public class PhraseHelper {
> :                                int doc, Map<BytesRef, Spans> result)
> :        throws IOException {
> :      // note: in WSTE there was some field specific looping that seemed pointless so that isn't here.
> : -    final IndexSearcher searcher = new IndexSearcher(readerContext);
> : +    final IndexSearcher searcher = new IndexSearcher(readerContext.reader());
> :      searcher.setQueryCache(null);
> :      if (willRewrite) {
> :        spanQuery = (SpanQuery) searcher.rewrite(spanQuery); // searcher.rewrite loops till done
> :      }
> :
> :      // Get the underlying query terms
> : -
> : -    TreeSet<Term> termSet = new TreeSet<>(); // sorted so we can loop over results in order shortly...
> : +    TreeSet<Term> termSet = new FieldFilteringTermSet(); // sorted so we can loop over results in order shortly...
> :      searcher.createWeight(spanQuery, false).extractTerms(termSet);//needsScores==false
> :
> :      // Get Spans by running the query against the reader
> : @@ -240,9 +275,6 @@ public class PhraseHelper {
> :      for (final Term queryTerm : termSet) {
> :        // note: we expect that at least one query term will pass these filters. This is because the collected
> :        //   spanQuery list were already filtered by these conditions.
> : -      if (fieldName != null && fieldName.equals(queryTerm.field()) == false) {
> : -        continue;
> : -      }
> :        if (positionInsensitiveTerms.contains(queryTerm)) {
> :          continue;
> :        }
> : @@ -375,19 +407,17 @@ public class PhraseHelper {
> :    }
> :
> :    /**
> : -   * Simple HashSet that filters out Terms not matching a desired field on {@code add()}.
> : +   * Simple TreeSet that filters out Terms not matching the provided predicate on {@code add()}.
> :     */
> : -  private static class FieldFilteringTermHashSet extends HashSet<Term> {
> : -    private final String field;
> : -
> : -    FieldFilteringTermHashSet(String field) {
> : -      this.field = field;
> : -    }
> : -
> : +  private class FieldFilteringTermSet extends TreeSet<Term> {
> :      @Override
> :      public boolean add(Term term) {
> : -      if (term.field().equals(field)) {
> : -        return super.add(term);
> : +      if (fieldMatcher.test(term.field())) {
> : +        if (term.field().equals(fieldName)) {
> : +          return super.add(term);
> : +        } else {
> : +          return super.add(new Term(fieldName, term.bytes()));
> : +        }
> :        } else {
> :          return false;
> :        }
> : @@ -500,6 +530,64 @@ public class PhraseHelper {
> :    }
> :
> :    /**
> : +   * This reader will just delegate every call to a single field in the wrapped
> : +   * LeafReader. This way we ensure that all queries going through this reader target the same field.
> : +  */
> : +  static final class SingleFieldFilterLeafReader extends FilterLeafReader {
> : +    final String fieldName;
> : +    SingleFieldFilterLeafReader(LeafReader in, String fieldName) {
> : +      super(in);
> : +      this.fieldName = fieldName;
> : +    }
> : +
> : +    @Override
> : +    public FieldInfos getFieldInfos() {
> : +      throw new UnsupportedOperationException();
> : +    }
> : +
> : +    @Override
> : +    public Fields fields() throws IOException {
> : +      return new FilterFields(super.fields()) {
> : +        @Override
> : +        public Terms terms(String field) throws IOException {
> : +          return super.terms(fieldName);
> : +        }
> : +
> : +        @Override
> : +        public Iterator<String> iterator() {
> : +          return Collections.singletonList(fieldName).iterator();
> : +        }
> : +
> : +        @Override
> : +        public int size() {
> : +          return 1;
> : +        }
> : +      };
> : +    }
> : +
> : +    @Override
> : +    public NumericDocValues getNumericDocValues(String field) throws IOException {
> : +      return super.getNumericDocValues(fieldName);
> : +    }
> : +
> : +    @Override
> : +    public BinaryDocValues getBinaryDocValues(String field) throws IOException {
> : +      return super.getBinaryDocValues(fieldName);
> : +    }
> : +
> : +    @Override
> : +    public SortedDocValues getSortedDocValues(String field) throws IOException {
> : +      return super.getSortedDocValues(fieldName);
> : +    }
> : +
> : +    @Override
> : +    public NumericDocValues getNormValues(String field) throws IOException {
> : +      return super.getNormValues(fieldName);
> : +    }
> : +  }
> : +
> : +
> : +  /**
> :     * A Spans based on a list of cached spans for one doc.  It is pre-positioned to this doc.
> :     */
> :    private static class CachedSpans extends Spans {
> :
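The FieldFilteringTermSet change above filters terms on add() and rewrites accepted terms from other fields onto the highlighted field. The idea can be sketched without Lucene on the classpath as follows. This is only an illustration under stated assumptions: SimpleTerm and FilteringTermSet are hypothetical stand-ins invented here, not classes from the commit.

```java
import java.util.TreeSet;
import java.util.function.Predicate;

final class FilteringTermSetSketch {

  /** Hypothetical stand-in for a Lucene Term: a (field, text) pair ordered by field, then text. */
  static final class SimpleTerm implements Comparable<SimpleTerm> {
    final String field;
    final String text;

    SimpleTerm(String field, String text) {
      this.field = field;
      this.text = text;
    }

    @Override
    public int compareTo(SimpleTerm other) {
      int cmp = field.compareTo(other.field);
      return cmp != 0 ? cmp : text.compareTo(other.text);
    }
  }

  /** Sorted set that rejects non-matching fields and rewrites accepted terms onto one field. */
  static final class FilteringTermSet extends TreeSet<SimpleTerm> {
    private final String fieldName;
    private final Predicate<String> fieldMatcher;

    FilteringTermSet(String fieldName, Predicate<String> fieldMatcher) {
      this.fieldName = fieldName;
      this.fieldMatcher = fieldMatcher;
    }

    @Override
    public boolean add(SimpleTerm term) {
      if (!fieldMatcher.test(term.field)) {
        return false; // field rejected by the predicate
      }
      if (term.field.equals(fieldName)) {
        return super.add(term);
      }
      // Accepted term from another field: re-home it on the highlighted field.
      return super.add(new SimpleTerm(fieldName, term.text));
    }
  }
}
```

With a permissive matcher, the same text arriving from two different fields collapses into a single entry on the highlighted field, which appears to be the point of remapping before the sorted loop over the term set.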
> :
> : http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
> : ----------------------------------------------------------------------
> : diff --git a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
> : index ac5f0f6..bbcfd5b 100644
> : --- a/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
> : +++ b/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
> : @@ -24,6 +24,7 @@ import java.util.Arrays;
> :  import java.util.Collection;
> :  import java.util.EnumSet;
> :  import java.util.HashMap;
> : +import java.util.HashSet;
> :  import java.util.List;
> :  import java.util.Locale;
> :  import java.util.Map;
> : @@ -31,6 +32,7 @@ import java.util.Objects;
> :  import java.util.Set;
> :  import java.util.SortedSet;
> :  import java.util.TreeSet;
> : +import java.util.function.Predicate;
> :  import java.util.function.Supplier;
> :
> :  import org.apache.lucene.analysis.Analyzer;
> : @@ -58,7 +60,6 @@ import org.apache.lucene.search.Weight;
> :  import org.apache.lucene.search.spans.SpanQuery;
> :  import org.apache.lucene.util.BytesRef;
> :  import org.apache.lucene.util.InPlaceMergeSorter;
> : -import org.apache.lucene.util.UnicodeUtil;
> :  import org.apache.lucene.util.automaton.CharacterRunAutomaton;
> :
> :  /**
> : @@ -119,13 +120,13 @@ public class UnifiedHighlighter {
> :
> :    private boolean defaultPassageRelevancyOverSpeed = true; //For analysis, prefer MemoryIndexOffsetStrategy
> :
> : -  // private boolean defaultRequireFieldMatch = true; TODO
> : -
> :    private int maxLength = DEFAULT_MAX_LENGTH;
> :
> :    // BreakIterator is stateful so we use a Supplier factory method
> :    private Supplier<BreakIterator> defaultBreakIterator = () -> BreakIterator.getSentenceInstance(Locale.ROOT);
> :
> : +  private Predicate<String> defaultFieldMatcher;
> : +
> :    private PassageScorer defaultScorer = new PassageScorer();
> :
> :    private PassageFormatter defaultFormatter = new DefaultPassageFormatter();
> : @@ -140,8 +141,8 @@ public class UnifiedHighlighter {
> :    /**
> :     * Calls {@link Weight#extractTerms(Set)} on an empty index for the query.
> :     */
> : -  protected static SortedSet<Term> extractTerms(Query query) throws IOException {
> : -    SortedSet<Term> queryTerms = new TreeSet<>();
> : +  protected static Set<Term> extractTerms(Query query) throws IOException {
> : +    Set<Term> queryTerms = new HashSet<>();
> :      EMPTY_INDEXSEARCHER.createNormalizedWeight(query, false).extractTerms(queryTerms);
> :      return queryTerms;
> :    }
> : @@ -197,6 +198,10 @@ public class UnifiedHighlighter {
> :      this.cacheFieldValCharsThreshold = cacheFieldValCharsThreshold;
> :    }
> :
> : +  public void setFieldMatcher(Predicate<String> predicate) {
> : +    this.defaultFieldMatcher = predicate;
> : +  }
> : +
> :    /**
> :     * Returns whether {@link MultiTermQuery} derivatives will be highlighted.  By default it's enabled.  MTQ
> :     * highlighting can be expensive, particularly when using offsets in postings.
> : @@ -220,6 +225,18 @@ public class UnifiedHighlighter {
> :      return defaultPassageRelevancyOverSpeed;
> :    }
> :
> : +  /**
> : +   * Returns the predicate to use for extracting the query part that must be highlighted.
> : +   * By default only queries that target the current field are kept. (AKA requireFieldMatch)
> : +   */
> : +  protected Predicate<String> getFieldMatcher(String field) {
> : +    if (defaultFieldMatcher != null) {
> : +      return defaultFieldMatcher;
> : +    } else {
> : +      // requireFieldMatch = true
> : +      return (qf) -> field.equals(qf);
> : +    }
> : +  }
> :
> :    /**
> :     * The maximum content size to process.  Content will be truncated to this size before highlighting. Typically
> : @@ -548,7 +565,7 @@ public class UnifiedHighlighter {
> :      copyAndSortFieldsWithMaxPassages(fieldsIn, maxPassagesIn, fields, maxPassages); // latter 2 are "out" params
> :
> :      // Init field highlighters (where most of the highlight logic lives, and on a per field basis)
> : -    SortedSet<Term> queryTerms = extractTerms(query);
> : +    Set<Term> queryTerms = extractTerms(query);
> :      FieldHighlighter[] fieldHighlighters = new FieldHighlighter[fields.length];
> :      int numTermVectors = 0;
> :      int numPostings = 0;
> : @@ -718,13 +735,13 @@ public class UnifiedHighlighter {
> :            getClass().getSimpleName() + " without an IndexSearcher.");
> :      }
> :      Objects.requireNonNull(content, "content is required");
> : -    SortedSet<Term> queryTerms = extractTerms(query);
> : +    Set<Term> queryTerms = extractTerms(query);
> :      return getFieldHighlighter(field, query, queryTerms, maxPassages)
> :          .highlightFieldForDoc(null, -1, content);
> :    }
> :
> : -  protected FieldHighlighter getFieldHighlighter(String field, Query query, SortedSet<Term> allTerms, int maxPassages) {
> : -    BytesRef[] terms = filterExtractedTerms(field, allTerms);
> : +  protected FieldHighlighter getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages) {
> : +    BytesRef[] terms = filterExtractedTerms(getFieldMatcher(field), allTerms);
> :      Set<HighlightFlag> highlightFlags = getFlags(field);
> :      PhraseHelper phraseHelper = getPhraseHelper(field, query, highlightFlags);
> :      CharacterRunAutomaton[] automata = getAutomata(field, query, highlightFlags);
> : @@ -738,19 +755,15 @@ public class UnifiedHighlighter {
> :          getFormatter(field));
> :    }
> :
> : -  protected static BytesRef[] filterExtractedTerms(String field, SortedSet<Term> queryTerms) {
> : -    // TODO consider requireFieldMatch
> : -    Term floor = new Term(field, "");
> : -    Term ceiling = new Term(field, UnicodeUtil.BIG_TERM);
> : -    SortedSet<Term> fieldTerms = queryTerms.subSet(floor, ceiling);
> : -
> : -    // Strip off the redundant field:
> : -    BytesRef[] terms = new BytesRef[fieldTerms.size()];
> : -    int termUpto = 0;
> : -    for (Term term : fieldTerms) {
> : -      terms[termUpto++] = term.bytes();
> : +  protected static BytesRef[] filterExtractedTerms(Predicate<String> fieldMatcher, Set<Term> queryTerms) {
> : +    // Strip off the redundant field and sort the remaining terms
> : +    SortedSet<BytesRef> filteredTerms = new TreeSet<>();
> : +    for (Term term : queryTerms) {
> : +      if (fieldMatcher.test(term.field())) {
> : +        filteredTerms.add(term.bytes());
> : +      }
> :      }
> : -    return terms;
> : +    return filteredTerms.toArray(new BytesRef[filteredTerms.size()]);
> :    }
> :
> :    protected Set<HighlightFlag> getFlags(String field) {
> : @@ -771,14 +784,13 @@ public class UnifiedHighlighter {
> :      boolean highlightPhrasesStrictly = highlightFlags.contains(HighlightFlag.PHRASES);
> :      boolean handleMultiTermQuery = highlightFlags.contains(HighlightFlag.MULTI_TERM_QUERY);
> :      return highlightPhrasesStrictly ?
> : -        new PhraseHelper(query, field, this::requiresRewrite, this::preSpanQueryRewrite, !handleMultiTermQuery) :
> : -        PhraseHelper.NONE;
> : +        new PhraseHelper(query, field, getFieldMatcher(field),
> : +            this::requiresRewrite, this::preSpanQueryRewrite, !handleMultiTermQuery) : PhraseHelper.NONE;
> :    }
> :
> :    protected CharacterRunAutomaton[] getAutomata(String field, Query query, Set<HighlightFlag> highlightFlags) {
> :      return highlightFlags.contains(HighlightFlag.MULTI_TERM_QUERY)
> : -        ? MultiTermHighlighting.extractAutomata(query, field, !highlightFlags.contains(HighlightFlag.PHRASES),
> : -          this::preMultiTermQueryRewrite)
> : +        ? MultiTermHighlighting.extractAutomata(query, getFieldMatcher(field), !highlightFlags.contains(HighlightFlag.PHRASES), this::preMultiTermQueryRewrite)
> :          : ZERO_LEN_AUTOMATA_ARRAY;
> :    }
> :
> : @@ -826,7 +838,7 @@ public class UnifiedHighlighter {
> :            //skip using a memory index since it's pure term filtering
> :            return new TokenStreamOffsetStrategy(field, terms, phraseHelper, automata, getIndexAnalyzer());
> :          } else {
> : -          return new MemoryIndexOffsetStrategy(field, terms, phraseHelper, automata, getIndexAnalyzer(),
> : +          return new MemoryIndexOffsetStrategy(field, getFieldMatcher(field), terms, phraseHelper, automata, getIndexAnalyzer(),
> :                this::preMultiTermQueryRewrite);
> :          }
> :        case NONE_NEEDED:
> :
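The getFieldMatcher / filterExtractedTerms pair above defaults to strict field equality unless a predicate override is set, then strips the field and sorts what remains. A minimal sketch of that flow, assuming plain strings in place of Lucene's Term and BytesRef, might look like this. The names FieldMatcherSketch, fieldMatcherFor and filterTerms are hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Predicate;

final class FieldMatcherSketch {

  /** Uses the override when one is set, otherwise requires an exact field match. */
  static Predicate<String> fieldMatcherFor(String field, Predicate<String> override) {
    return override != null ? override : field::equals;
  }

  /**
   * Each query term is a {field, text} pair. Keeps the text of terms whose field
   * passes the matcher, dropping the field and returning the texts sorted and
   * de-duplicated.
   */
  static String[] filterTerms(Predicate<String> fieldMatcher, List<String[]> queryTerms) {
    SortedSet<String> filtered = new TreeSet<>();
    for (String[] term : queryTerms) {
      if (fieldMatcher.test(term[0])) {
        filtered.add(term[1]);
      }
    }
    return filtered.toArray(new String[0]);
  }
}
```

Passing a matcher that always returns true corresponds to requireFieldMatch=false: terms from every field are kept, and a text that occurs under several fields is emitted once.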
> :
> : http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
> : ----------------------------------------------------------------------
> : diff --git a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
> : index 0fd7d3d..ddf8a92 100644
> : --- a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
> : +++ b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/TestUnifiedHighlighter.java
> : @@ -25,6 +25,7 @@ import java.util.Arrays;
> :  import java.util.Collections;
> :  import java.util.List;
> :  import java.util.Map;
> : +import java.util.function.Predicate;
> :
> :  import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;
> :  import org.apache.lucene.analysis.MockAnalyzer;
> : @@ -32,14 +33,17 @@ import org.apache.lucene.analysis.MockTokenizer;
> :  import org.apache.lucene.document.Document;
> :  import org.apache.lucene.document.Field;
> :  import org.apache.lucene.document.FieldType;
> : +import org.apache.lucene.index.IndexOptions;
> :  import org.apache.lucene.index.IndexReader;
> :  import org.apache.lucene.index.RandomIndexWriter;
> :  import org.apache.lucene.index.Term;
> :  import org.apache.lucene.search.BooleanClause;
> :  import org.apache.lucene.search.BooleanQuery;
> :  import org.apache.lucene.search.DocIdSetIterator;
> : +import org.apache.lucene.search.FuzzyQuery;
> :  import org.apache.lucene.search.IndexSearcher;
> :  import org.apache.lucene.search.PhraseQuery;
> : +import org.apache.lucene.search.PrefixQuery;
> :  import org.apache.lucene.search.Query;
> :  import org.apache.lucene.search.ScoreDoc;
> :  import org.apache.lucene.search.Sort;
> : @@ -959,4 +963,275 @@ public class TestUnifiedHighlighter extends LuceneTestCase {
> :      ir.close();
> :    }
> :
> : +  private IndexReader indexSomeFields() throws IOException {
> : +    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, indexAnalyzer);
> : +    FieldType ft = new FieldType();
> : +    ft.setIndexOptions(IndexOptions.NONE);
> : +    ft.setTokenized(false);
> : +    ft.setStored(true);
> : +    ft.freeze();
> : +
> : +    Field title = new Field("title", "", fieldType);
> : +    Field text = new Field("text", "", fieldType);
> : +    Field category = new Field("category", "", fieldType);
> : +
> : +    Document doc = new Document();
> : +    doc.add(title);
> : +    doc.add(text);
> : +    doc.add(category);
> : +    title.setStringValue("This is the title field.");
> : +    text.setStringValue("This is the text field. You can put some text if you want.");
> : +    category.setStringValue("This is the category field.");
> : +    iw.addDocument(doc);
> : +
> : +    IndexReader ir = iw.getReader();
> : +    iw.close();
> : +    return ir;
> : +  }
> : +
> : +  public void testFieldMatcherTermQuery() throws Exception {
> : +    IndexReader ir = indexSomeFields();
> : +    IndexSearcher searcher = newSearcher(ir);
> : +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
> : +      @Override
> : +      protected Predicate<String> getFieldMatcher(String field) {
> : +        // requireFieldMatch=false
> : +        return (qf) -> true;
> : +      }
> : +    };
> : +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
> : +    BooleanQuery.Builder queryBuilder =
> : +        new BooleanQuery.Builder()
> : +            .add(new TermQuery(new Term("text", "some")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("text", "field")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("text", "this")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("title", "is")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("title", "this")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("category", "this")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("category", "some")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("category", "category")), BooleanClause.Occur.SHOULD);
> : +    Query query = queryBuilder.build();
> : +
> : +    // title
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the title <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the title <b>field</b>.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // text
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the text field. ", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // category
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the <b>category</b> <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the <b>category</b> field.", snippets[0]);
> : +
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +    ir.close();
> : +  }
> : +
> : +  public void testFieldMatcherMultiTermQuery() throws Exception {
> : +    IndexReader ir = indexSomeFields();
> : +    IndexSearcher searcher = newSearcher(ir);
> : +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
> : +      @Override
> : +      protected Predicate<String> getFieldMatcher(String field) {
> : +        // requireFieldMatch=false
> : +        return (qf) -> true;
> : +      }
> : +    };
> : +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
> : +    BooleanQuery.Builder queryBuilder =
> : +        new BooleanQuery.Builder()
> : +            .add(new FuzzyQuery(new Term("text", "sime"), 1), BooleanClause.Occur.SHOULD)
> : +            .add(new PrefixQuery(new Term("text", "fie")), BooleanClause.Occur.SHOULD)
> : +            .add(new PrefixQuery(new Term("text", "thi")), BooleanClause.Occur.SHOULD)
> : +            .add(new TermQuery(new Term("title", "is")), BooleanClause.Occur.SHOULD)
> : +            .add(new PrefixQuery(new Term("title", "thi")), BooleanClause.Occur.SHOULD)
> : +            .add(new PrefixQuery(new Term("category", "thi")), BooleanClause.Occur.SHOULD)
> : +            .add(new FuzzyQuery(new Term("category", "sime"), 1), BooleanClause.Occur.SHOULD)
> : +            .add(new PrefixQuery(new Term("category", "categ")), BooleanClause.Occur.SHOULD);
> : +    Query query = queryBuilder.build();
> : +
> : +    // title
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the title <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the title <b>field</b>.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // text
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the text <b>field</b>. You can put <b>some</b> text if you want.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the text field. ", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // category
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the <b>category</b> <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> is the <b>category</b> field.", snippets[0]);
> : +
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +    ir.close();
> : +  }
> : +
> : +  public void testFieldMatcherPhraseQuery() throws Exception {
> : +    IndexReader ir = indexSomeFields();
> : +    IndexSearcher searcher = newSearcher(ir);
> : +    UnifiedHighlighter highlighterNoFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer) {
> : +      @Override
> : +      protected Predicate<String> getFieldMatcher(String field) {
> : +        // requireFieldMatch=false
> : +        return (qf) -> true;
> : +      }
> : +    };
> : +    UnifiedHighlighter highlighterFieldMatch = new UnifiedHighlighter(searcher, indexAnalyzer);
> : +    BooleanQuery.Builder queryBuilder =
> : +        new BooleanQuery.Builder()
> : +            .add(new PhraseQuery("title", "this", "is", "the", "title"), BooleanClause.Occur.SHOULD)
> : +            .add(new PhraseQuery(2, "category", "this", "is", "the", "field"), BooleanClause.Occur.SHOULD)
> : +            .add(new PhraseQuery("text", "this", "is"), BooleanClause.Occur.SHOULD)
> : +            .add(new PhraseQuery("category", "this", "is"), BooleanClause.Occur.SHOULD)
> : +            .add(new PhraseQuery(1, "text", "you", "can", "put", "text"), BooleanClause.Occur.SHOULD);
> : +    Query query = queryBuilder.build();
> : +
> : +    // title
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>title</b> <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>title</b> field.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("title", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the title field.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // text
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> <b>the</b> <b>text</b> <b>field</b>. <b>You</b> <b>can</b> <b>put</b> some <b>text</b> if you want.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the <b>text</b> field. <b>You</b> <b>can</b> <b>put</b> some <b>text</b> if you want.", snippets[0]);
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "title".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("text", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("This is the text field. You can put some text if you want.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +
> : +    // category
> : +    {
> : +      TopDocs topDocs = searcher.search(query, 10, Sort.INDEXORDER);
> : +      assertEquals(1, topDocs.totalHits);
> : +      String[] snippets = highlighterNoFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> <b>the</b> category <b>field</b>.", snippets[0]);
> : +
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> <b>the</b> category <b>field</b>.", snippets[0]);
> : +
> : +
> : +      highlighterFieldMatch.setFieldMatcher((fq) -> "text".equals(fq));
> : +      snippets = highlighterFieldMatch.highlight("category", query, topDocs, 10);
> : +      assertEquals(1, snippets.length);
> : +      assertEquals("<b>This</b> <b>is</b> the category field.", snippets[0]);
> : +      highlighterFieldMatch.setFieldMatcher(null);
> : +    }
> : +    ir.close();
> : +  }
> :  }
> :
> : http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e7a7dbf/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
> : ----------------------------------------------------------------------
> : diff --git a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
> : index d150940..10757a5 100644
> : --- a/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
> : +++ b/lucene/highlighter/src/test/org/apache/lucene/search/uhighlight/visibility/TestUnifiedHighlighterExtensibility.java
> : @@ -23,7 +23,6 @@ import java.util.Collections;
> :  import java.util.List;
> :  import java.util.Map;
> :  import java.util.Set;
> : -import java.util.SortedSet;
> :
> :  import org.apache.lucene.analysis.Analyzer;
> :  import org.apache.lucene.analysis.MockAnalyzer;
> : @@ -144,7 +143,7 @@ public class TestUnifiedHighlighterExtensibility extends LuceneTestCase {
> :        }
> :
> :        @Override
> : -      protected FieldHighlighter getFieldHighlighter(String field, Query query, SortedSet<Term> allTerms, int maxPassages) {
> : +      protected FieldHighlighter getFieldHighlighter(String field, Query query, Set<Term> allTerms, int maxPassages) {
> :          return super.getFieldHighlighter(field, query, allTerms, maxPassages);
> :        }
> :
> :
> :
>
> -Hoss
> http://www.lucidworks.com/
>
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com