Posted to commits@lucene.apache.org by rm...@apache.org on 2015/02/09 17:12:33 UTC

svn commit: r1658447 [2/2] - in /lucene/dev/trunk/lucene: classification/src/java/org/apache/lucene/classification/ classification/src/java/org/apache/lucene/classification/utils/ codecs/src/java/org/apache/lucene/codecs/blockterms/ codecs/src/java/org...

Copied: lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package-info.java (from r1658398, lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package.html)
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package-info.java?p2=lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package-info.java&p1=lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package.html&r1=1658398&r2=1658447&rev=1658447&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package.html (original)
+++ lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/package-info.java Mon Feb  9 16:12:32 2015
@@ -1,199 +1,200 @@
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
-
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-
-<p>This module enables search result grouping with Lucene, where hits
-with the same value in the specified single-valued group field are
-grouped together.  For example, if you group by the <code>author</code>
-field, then all documents with the same value in the <code>author</code>
-field fall into a single group.</p>
-
-<p>Grouping requires a number of inputs:</p>
-
-  <ul>
-    <li> <code>groupField</code>: this is the field used for grouping.
-      For example, if you use the <code>author</code> field then each
-      group has all books by the same author.  Documents that don't
-      have this field are grouped under a single group with
-      a <code>null</code> group value.
-
-    <li> <code>groupSort</code>: how the groups are sorted.  For sorting
-      purposes, each group is "represented" by the highest-sorted
-      document according to the <code>groupSort</code> within it.  For
-      example, if you specify "price" (ascending) then the first group
-      is the one with the lowest price book within it.  Or if you
-      specify relevance group sort, then the first group is the one
-      containing the highest scoring book.
-
-    <li> <code>topNGroups</code>: how many top groups to keep.  For
-      example, 10 means the top 10 groups are computed.
-
-    <li> <code>groupOffset</code>: which "slice" of top groups you want to
-      retrieve.  For example, 3 means you'll get 7 groups back
-      (assuming <code>topNGroups</code> is 10).  This is useful for
-      paging, where you might show 5 groups per page.
-
-    <li> <code>withinGroupSort</code>: how the documents within each group
-      are sorted.  This can be different from the group sort.
-
-    <li> <code>maxDocsPerGroup</code>: how many top documents within each
-      group to keep.
-
-    <li> <code>withinGroupOffset</code>: which "slice" of top
-      documents you want to retrieve from each group.
-
-  </ul>
-
-<p>The implementation is two-pass: the first pass ({@link
-  org.apache.lucene.search.grouping.term.TermFirstPassGroupingCollector})
-  gathers the top groups, and the second pass ({@link
-  org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector})
-  gathers documents within those groups.  If the search is costly to
-  run you may want to use the {@link
-  org.apache.lucene.search.CachingCollector} class, which
-  caches hits and can (quickly) replay them for the second pass.  This
-  way you only run the query once, but you pay a RAM cost to (briefly)
-  hold all hits.  Results are returned as a {@link
-  org.apache.lucene.search.grouping.TopGroups} instance.</p>
-
-<p>
-  This module abstracts away what defines a group and how it is collected. All grouping collectors
-  are abstract and currently have term-based implementations. One can implement
-  collectors that, for example, group on multiple fields.
-</p>
-
-<p>Known limitations:</p>
-<ul>
-  <li> For the two-pass grouping search, the group field must be
-    indexed as a {@link org.apache.lucene.document.SortedDocValuesField}.
-  <li> Although Solr supports grouping by function and this module has an abstraction of what a group is, there are currently only
-    implementations for grouping based on terms.
-  <li> Sharding is not directly supported, though it is not too
-    difficult if you can merge the top groups and top documents per
-    group yourself.
-</ul>
-
-<p>Typical usage for the generic two-pass grouping search looks like this using the grouping convenience utility
-  (optionally using caching for the second pass search):</p>
-
-<pre class="prettyprint">
-  GroupingSearch groupingSearch = new GroupingSearch("author");
-  groupingSearch.setGroupSort(groupSort);
-  groupingSearch.setFillSortFields(fillFields);
-
-  if (useCache) {
-    // Sets cache in MB
-    groupingSearch.setCachingInMB(4.0, true);
-  }
-
-  if (requiredTotalGroupCount) {
-    groupingSearch.setAllGroups(true);
-  }
-
-  TermQuery query = new TermQuery(new Term("content", searchTerm));
-  TopGroups&lt;BytesRef&gt; result = groupingSearch.search(indexSearcher, query, groupOffset, groupLimit);
-
-  // Render result...
-  if (requiredTotalGroupCount) {
-    int totalGroupCount = result.totalGroupCount;
-  }
-</pre>
-
-<p>To use the single-pass <code>BlockGroupingCollector</code>,
-   first, at indexing time, you must ensure all docs in each group
-   are added as a block, and you have some way to find the last
-   document of each group.  One simple way to do this is to add a
-   marker binary field:</p>
-
-<pre class="prettyprint">
-  // Create Documents from your source:
-  List&lt;Document&gt; oneGroup = ...;
-  
-  Field groupEndField = new Field("groupEnd", "x", Field.Store.NO, Field.Index.NOT_ANALYZED);
-  groupEndField.setIndexOptions(IndexOptions.DOCS_ONLY);
-  groupEndField.setOmitNorms(true);
-  oneGroup.get(oneGroup.size()-1).add(groupEndField);
-
-  // You can also use writer.updateDocuments(); just be sure you
-  // replace an entire previous doc block with this new one.  For
-  // example, each group could have a "groupID" field, with the same
-  // value for all docs in this group:
-  writer.addDocuments(oneGroup);
-</pre>
-
-Then, at search time, do this up front:
-
-<pre class="prettyprint">
-  // Set this once in your app & save away for reusing across all queries:
-  Filter groupEndDocs = new CachingWrapperFilter(new QueryWrapperFilter(new TermQuery(new Term("groupEnd", "x"))));
-</pre>
-
-Finally, do this per search:
-
-<pre class="prettyprint">
-  // Per search:
-  BlockGroupingCollector c = new BlockGroupingCollector(groupSort, groupOffset+topNGroups, needsScores, groupEndDocs);
-  s.search(new TermQuery(new Term("content", searchTerm)), c);
-  TopGroups groupsResult = c.getTopGroups(withinGroupSort, groupOffset, docOffset, docOffset+docsPerGroup, fillFields);
-
-  // Render groupsResult...
-</pre>
-
-Or alternatively use the <code>GroupingSearch</code> convenience utility:
-
-<pre class="prettyprint">
-  // Per search:
-  GroupingSearch groupingSearch = new GroupingSearch(groupEndDocs);
-  groupingSearch.setGroupSort(groupSort);
-  groupingSearch.setIncludeScores(needsScores);
-  TermQuery query = new TermQuery(new Term("content", searchTerm));
-  TopGroups groupsResult = groupingSearch.search(indexSearcher, query, groupOffset, groupLimit);
-
-  // Render groupsResult...
-</pre>
-
-Note that the <code>groupValue</code> of each <code>GroupDocs</code>
-will be <code>null</code>, so if you need to present this value you'll
-have to separately retrieve it (for example using stored
-fields, <code>FieldCache</code>, etc.).
-
-<p>Another collector is the <code>TermAllGroupHeadsCollector</code>, which can be used to retrieve the most relevant
-   document of each group, also known as the group heads. This can be useful when one wants to compute group-based
-   facets or statistics on the complete query result. The collector can be executed during the first or second
-   phase. This collector can also be used with the <code>GroupingSearch</code> convenience utility, but if one only
-   wants to compute the most relevant documents per group it is better to just use the collector directly, as shown below.</p>
-
-<pre class="prettyprint">
-  AbstractAllGroupHeadsCollector c = TermAllGroupHeadsCollector.create(groupField, sortWithinGroup);
-  s.search(new TermQuery(new Term("content", searchTerm)), c);
-  // Return all group heads as int array
-  int[] groupHeadsArray = c.retrieveGroupHeads();
-  // Return all group heads as FixedBitSet.
-  int maxDoc = s.maxDoc();
-  FixedBitSet groupHeadsBitSet = c.retrieveGroupHeads(maxDoc);
-</pre>
-
-<p>For each of the above collector types there is also a variant that works with <code>ValueSource</code> instead
-   of fields. Concretely, this means that these variants can work with functions. These variants are slower than
-   their term-based counterparts. These implementations are located in the
-   <code>org.apache.lucene.search.grouping.function</code> package, but can also be used with the
-  <code>GroupingSearch</code> convenience utility.
-</p>
-
-</body>
-</html>
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** 
+ * Grouping.
+ * <p>
+ * This module enables search result grouping with Lucene, where hits
+ * with the same value in the specified single-valued group field are
+ * grouped together.  For example, if you group by the <code>author</code>
+ * field, then all documents with the same value in the <code>author</code>
+ * field fall into a single group.
+ * </p>
+ * 
+ * <p>Grouping requires a number of inputs:</p>
+ * 
+ * <ul>
+ *   <li><code>groupField</code>: this is the field used for grouping.
+ *       For example, if you use the <code>author</code> field then each
+ *       group has all books by the same author.  Documents that don't
+ *       have this field are grouped under a single group with
+ *       a <code>null</code> group value.
+ * 
+ *   <li><code>groupSort</code>: how the groups are sorted.  For sorting
+ *       purposes, each group is "represented" by the highest-sorted
+ *       document according to the <code>groupSort</code> within it.  For
+ *       example, if you specify "price" (ascending) then the first group
+ *       is the one with the lowest price book within it.  Or if you
+ *       specify relevance group sort, then the first group is the one
+ *       containing the highest scoring book.
+ * 
+ *   <li><code>topNGroups</code>: how many top groups to keep.  For
+ *       example, 10 means the top 10 groups are computed.
+ * 
+ *   <li><code>groupOffset</code>: which "slice" of top groups you want to
+ *       retrieve.  For example, 3 means you'll get 7 groups back
+ *       (assuming <code>topNGroups</code> is 10).  This is useful for
+ *       paging, where you might show 5 groups per page.
+ * 
+ *   <li><code>withinGroupSort</code>: how the documents within each group
+ *       are sorted.  This can be different from the group sort.
+ * 
+ *   <li><code>maxDocsPerGroup</code>: how many top documents within each
+ *       group to keep.
+ * 
+ *   <li><code>withinGroupOffset</code>: which "slice" of top
+ *       documents you want to retrieve from each group.
+ * 
+ * </ul>
+ * 
+ * <p>The implementation is two-pass: the first pass ({@link
+ *   org.apache.lucene.search.grouping.term.TermFirstPassGroupingCollector})
+ *   gathers the top groups, and the second pass ({@link
+ *   org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector})
+ *   gathers documents within those groups.  If the search is costly to
+ *   run you may want to use the {@link
+ *   org.apache.lucene.search.CachingCollector} class, which
+ *   caches hits and can (quickly) replay them for the second pass.  This
+ *   way you only run the query once, but you pay a RAM cost to (briefly)
+ *   hold all hits.  Results are returned as a {@link
+ *   org.apache.lucene.search.grouping.TopGroups} instance.</p>
+ * 
+ * <p>
+ *   This module abstracts away what defines a group and how it is collected. All grouping collectors
+ *   are abstract and currently have term-based implementations. One can implement
+ *   collectors that, for example, group on multiple fields.
+ * </p>
+ * 
+ * <p>Known limitations:</p>
+ * <ul>
+ *   <li> For the two-pass grouping search, the group field must be
+ *     indexed as a {@link org.apache.lucene.document.SortedDocValuesField}.
+ *   <li> Although Solr supports grouping by function and this module has an abstraction of what a group is, there are currently only
+ *     implementations for grouping based on terms.
+ *   <li> Sharding is not directly supported, though it is not too
+ *     difficult if you can merge the top groups and top documents per
+ *     group yourself.
+ * </ul>
+ * 
+ * <p>Typical usage for the generic two-pass grouping search looks like this using the grouping convenience utility
+ *   (optionally using caching for the second pass search):</p>
+ * 
+ * <pre class="prettyprint">
+ *   GroupingSearch groupingSearch = new GroupingSearch("author");
+ *   groupingSearch.setGroupSort(groupSort);
+ *   groupingSearch.setFillSortFields(fillFields);
+ * 
+ *   if (useCache) {
+ *     // Sets cache in MB
+ *     groupingSearch.setCachingInMB(4.0, true);
+ *   }
+ * 
+ *   if (requiredTotalGroupCount) {
+ *     groupingSearch.setAllGroups(true);
+ *   }
+ * 
+ *   TermQuery query = new TermQuery(new Term("content", searchTerm));
+ *   TopGroups&lt;BytesRef&gt; result = groupingSearch.search(indexSearcher, query, groupOffset, groupLimit);
+ * 
+ *   // Render result...
+ *   if (requiredTotalGroupCount) {
+ *     int totalGroupCount = result.totalGroupCount;
+ *   }
+ * </pre>
+ * 
+ * <p>To use the single-pass <code>BlockGroupingCollector</code>,
+ *    first, at indexing time, you must ensure all docs in each group
+ *    are added as a block, and you have some way to find the last
+ *    document of each group.  One simple way to do this is to add a
+ *    marker binary field:</p>
+ * 
+ * <pre class="prettyprint">
+ *   // Create Documents from your source:
+ *   List&lt;Document&gt; oneGroup = ...;
+ *   
+ *   Field groupEndField = new Field("groupEnd", "x", Field.Store.NO, Field.Index.NOT_ANALYZED);
+ *   groupEndField.setIndexOptions(IndexOptions.DOCS_ONLY);
+ *   groupEndField.setOmitNorms(true);
+ *   oneGroup.get(oneGroup.size()-1).add(groupEndField);
+ * 
+ *   // You can also use writer.updateDocuments(); just be sure you
+ *   // replace an entire previous doc block with this new one.  For
+ *   // example, each group could have a "groupID" field, with the same
+ *   // value for all docs in this group:
+ *   writer.addDocuments(oneGroup);
+ * </pre>
+ * 
+ * Then, at search time, do this up front:
+ * 
+ * <pre class="prettyprint">
+ *   // Set this once in your app &amp; save away for reusing across all queries:
+ *   Filter groupEndDocs = new CachingWrapperFilter(new QueryWrapperFilter(new TermQuery(new Term("groupEnd", "x"))));
+ * </pre>
+ * 
+ * Finally, do this per search:
+ * 
+ * <pre class="prettyprint">
+ *   // Per search:
+ *   BlockGroupingCollector c = new BlockGroupingCollector(groupSort, groupOffset+topNGroups, needsScores, groupEndDocs);
+ *   s.search(new TermQuery(new Term("content", searchTerm)), c);
+ *   TopGroups groupsResult = c.getTopGroups(withinGroupSort, groupOffset, docOffset, docOffset+docsPerGroup, fillFields);
+ * 
+ *   // Render groupsResult...
+ * </pre>
+ * 
+ * Or alternatively use the <code>GroupingSearch</code> convenience utility:
+ * 
+ * <pre class="prettyprint">
+ *   // Per search:
+ *   GroupingSearch groupingSearch = new GroupingSearch(groupEndDocs);
+ *   groupingSearch.setGroupSort(groupSort);
+ *   groupingSearch.setIncludeScores(needsScores);
+ *   TermQuery query = new TermQuery(new Term("content", searchTerm));
+ *   TopGroups groupsResult = groupingSearch.search(indexSearcher, query, groupOffset, groupLimit);
+ *
+ *   // Render groupsResult...
+ * </pre>
+ * 
+ * Note that the <code>groupValue</code> of each <code>GroupDocs</code>
+ * will be <code>null</code>, so if you need to present this value you'll
+ * have to separately retrieve it (for example using stored
+ * fields, <code>FieldCache</code>, etc.).
+ * 
+ * <p>Another collector is the <code>TermAllGroupHeadsCollector</code>, which can be used to retrieve the most relevant
+ *    document of each group, also known as the group heads. This can be useful when one wants to compute group-based
+ *    facets or statistics on the complete query result. The collector can be executed during the first or second
+ *    phase. This collector can also be used with the <code>GroupingSearch</code> convenience utility, but if one only
+ *    wants to compute the most relevant documents per group it is better to just use the collector directly, as shown below.</p>
+ * 
+ * <pre class="prettyprint">
+ *   AbstractAllGroupHeadsCollector c = TermAllGroupHeadsCollector.create(groupField, sortWithinGroup);
+ *   s.search(new TermQuery(new Term("content", searchTerm)), c);
+ *   // Return all group heads as int array
+ *   int[] groupHeadsArray = c.retrieveGroupHeads();
+ *   // Return all group heads as FixedBitSet.
+ *   int maxDoc = s.maxDoc();
+ *   FixedBitSet groupHeadsBitSet = c.retrieveGroupHeads(maxDoc);
+ * </pre>
+ * 
+ * <p>For each of the above collector types there is also a variant that works with <code>ValueSource</code> instead
+ *    of fields. Concretely, this means that these variants can work with functions. These variants are slower than
+ *    their term-based counterparts. These implementations are located in the
+ *    <code>org.apache.lucene.search.grouping.function</code> package, but can also be used with the
+ *   <code>GroupingSearch</code> convenience utility.
+ * </p>
+ */
+package org.apache.lucene.search.grouping;
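
Editorial sketch (not part of the commit): the two-pass scheme and the groupOffset/topNGroups slicing described in the Javadoc above can be modelled in plain Java without any Lucene classes. Everything here is a hypothetical stand-in: the class TwoPassGroupingSketch, the Book type, and the in-memory list of hits are assumptions made for illustration, and the group sort is fixed to ascending price. The first pass ranks groups by their best member and slices them; the second pass gathers the top documents of the surviving groups.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TwoPassGroupingSketch {

  /** Hypothetical stand-in for a hit: a group value (author) and a sort key (price). */
  public static final class Book {
    public final String author;
    public final String title;
    public final double price;

    public Book(String author, String title, double price) {
      this.author = author;
      this.title = title;
      this.price = price;
    }
  }

  /** First pass: rank groups by their lowest-priced member, keep topNGroups, skip groupOffset. */
  public static List<String> topGroups(List<Book> hits, int groupOffset, int topNGroups) {
    Map<String, Double> best = new LinkedHashMap<>();
    for (Book b : hits) {
      best.merge(b.author, b.price, Math::min); // each group is represented by its best document
    }
    List<String> ranked = new ArrayList<>(best.keySet());
    ranked.sort((a, b) -> Double.compare(best.get(a), best.get(b)));
    int to = Math.min(topNGroups, ranked.size());
    int from = Math.min(groupOffset, to);
    return new ArrayList<>(ranked.subList(from, to));
  }

  /** Second pass: collect up to maxDocsPerGroup documents for each selected group. */
  public static Map<String, List<Book>> docsPerGroup(
      List<Book> hits, List<String> groups, int maxDocsPerGroup) {
    Map<String, List<Book>> result = new LinkedHashMap<>();
    for (String g : groups) {
      result.put(g, new ArrayList<>());
    }
    for (Book b : hits) {
      List<Book> docs = result.get(b.author);
      if (docs != null) {
        docs.add(b); // hits from groups that did not survive the first pass are ignored
      }
    }
    for (List<Book> docs : result.values()) {
      docs.sort(Comparator.comparingDouble((Book b) -> b.price)); // within-group sort
      if (docs.size() > maxDocsPerGroup) {
        docs.subList(maxDocsPerGroup, docs.size()).clear();
      }
    }
    return result;
  }

  /** Small fixed data set used by main and by callers that want a quick check. */
  public static List<Book> sampleHits() {
    List<Book> hits = new ArrayList<>();
    hits.add(new Book("Adams", "A1", 12.0));
    hits.add(new Book("Brown", "B1", 5.0));
    hits.add(new Book("Adams", "A2", 7.0));
    hits.add(new Book("Clark", "C1", 9.0));
    hits.add(new Book("Brown", "B2", 20.0));
    return hits;
  }

  public static void main(String[] args) {
    List<Book> hits = sampleHits();
    List<String> groups = topGroups(hits, 1, 3);
    System.out.println(groups);
    for (Map.Entry<String, List<Book>> e : docsPerGroup(hits, groups, 1).entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue().get(0).title);
    }
  }
}
```

With the sample data the groups rank as Brown, Adams, Clark, so an offset of 1 with three top groups leaves Adams and Clark. The same arithmetic gives the 7 groups mentioned in the Javadoc for an offset of 3 with 10 top groups. Replaying the same hits list for the second pass plays the role that a caching collector plays in the real module.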

Copied: lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package-info.java (from r1658398, lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package.html)
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package-info.java?p2=lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package-info.java&p1=lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package.html&r1=1658398&r2=1658447&rev=1658447&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package.html (original)
+++ lucene/dev/trunk/lucene/grouping/src/java/org/apache/lucene/search/grouping/term/package-info.java Mon Feb  9 16:12:32 2015
@@ -1,21 +1,21 @@
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
 
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-Support for grouping by indexed terms via {@link org.apache.lucene.index.DocValues}.
-</body>
-</html>
+/**
+ * Support for grouping by indexed terms via {@link org.apache.lucene.index.DocValues}.
+ */
+package org.apache.lucene.search.grouping.term;

Copied: lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package-info.java (from r1658398, lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package.html)
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package-info.java?p2=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package-info.java&p1=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package.html&r1=1658398&r2=1658447&rev=1658447&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package.html (original)
+++ lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/highlight/package-info.java Mon Feb  9 16:12:32 2015
@@ -1,99 +1,95 @@
-<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
-
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-
-The highlight package contains classes to provide "keyword in context" features
-typically used to highlight search terms in the text of results pages.
-The Highlighter class is the central component and can be used to extract the
-most interesting sections of a piece of text and highlight them, with the help of
-Fragmenter, fragment Scorer, and Formatter classes.
-
-<h2>Example Usage</h2>
-
-<pre class="prettyprint">
-  //... Above, create documents with two fields, one with term vectors (tv) and one without (notv)
-  IndexSearcher searcher = new IndexSearcher(directory);
-  QueryParser parser = new QueryParser("notv", analyzer);
-  Query query = parser.parse("million");
-
-  TopDocs hits = searcher.search(query, 10);
-
-  SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
-  Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
-  for (int i = 0; i < hits.scoreDocs.length; i++) {
-    int id = hits.scoreDocs[i].doc;
-    Document doc = searcher.doc(id);
-    String text = doc.get("notv");
-    TokenStream tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), id, "notv", analyzer);
-    TextFragment[] frag = highlighter.getBestTextFragments(tokenStream, text, false, 10);//highlighter.getBestFragments(tokenStream, text, 3, "...");
-    for (int j = 0; j < frag.length; j++) {
-      if ((frag[j] != null) && (frag[j].getScore() > 0)) {
-        System.out.println((frag[j].toString()));
-      }
-    }
-    //Term vector
-    text = doc.get("tv");
-    tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), hits.scoreDocs[i].doc, "tv", analyzer);
-    frag = highlighter.getBestTextFragments(tokenStream, text, false, 10);
-    for (int j = 0; j < frag.length; j++) {
-      if ((frag[j] != null) && (frag[j].getScore() > 0)) {
-        System.out.println((frag[j].toString()));
-      }
-    }
-    System.out.println("-------------");
-  }
-</pre>
-
-<h2>New features 06/02/2005</h2>
-
-This release adds options for encoding (thanks to Nicko Cadell).
-An "Encoder" implementation such as the new SimpleHTMLEncoder class can be passed to the highlighter to encode
-all those non-xhtml standard characters such as &amp; into legal values. This simple class may not suffice for
-some languages -  Commons Lang has an implementation that could be used: escapeHtml(String) in
-http://svn.apache.org/viewcvs.cgi/jakarta/commons/proper/lang/trunk/src/java/org/apache/commons/lang/StringEscapeUtils.java?rev=137958&view=markup
-
-<h2>New features 22/12/2004</h2>
-
-This release adds some new capabilities:
-<ol>
-	<li>Faster highlighting using Term vector support</li>
-	<li>New formatting options to use color intensity to show informational value</li>
-	<li>Options for better summarization by using term IDF scores to influence fragment selection</li>
-</ol>
-
-<p>
-The highlighter takes a TokenStream as input. Until now these streams have typically been produced
-using an Analyzer but the new class TokenSources provides helper methods for obtaining TokenStreams from
-the new TermVector position support (see latest CVS version).</p>
-
-<p>The new class GradientFormatter can use a scale of colors to highlight terms according to their score.
-A subtle use of color can help emphasise the reasons for matching (useful when doing "MoreLikeThis" queries and
-you want to see what the basis of the similarities is).</p>
-
-<p>The QueryScorer class has a new constructor which can use an IndexReader to derive the IDF (inverse document frequency)
-for each term in order to influence the score. This is useful for helping to extract the most significant sections
-of a document and in supplying scores used by the new GradientFormatter to color significant words more strongly.
-The QueryScorer.getMaxWeight method is useful when passed to the GradientFormatter constructor to define the top score
-which is associated with the top color.</p>
-
-
-
-
-</body>
-</html>
\ No newline at end of file
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Highlighting search terms.
+ * <p>
+ * The highlight package contains classes to provide "keyword in context" features
+ * typically used to highlight search terms in the text of results pages.
+ * The Highlighter class is the central component and can be used to extract the
+ * most interesting sections of a piece of text and highlight them, with the help of
+ * Fragmenter, fragment Scorer, and Formatter classes.
+ * 
+ * <h2>Example Usage</h2>
+ *
+ * <pre class="prettyprint">
+ *   //... Above, create documents with two fields, one with term vectors (tv) and one without (notv)
+ *   IndexSearcher searcher = new IndexSearcher(directory);
+ *   QueryParser parser = new QueryParser("notv", analyzer);
+ *   Query query = parser.parse("million");
+ * 
+ *   TopDocs hits = searcher.search(query, 10);
+ * 
+ *   SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
+ *   Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
+ *   for (int i = 0; i &lt; hits.scoreDocs.length; i++) {
+ *     int id = hits.scoreDocs[i].doc;
+ *     Document doc = searcher.doc(id);
+ *     String text = doc.get("notv");
+ *     TokenStream tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), id, "notv", analyzer);
+ *     TextFragment[] frag = highlighter.getBestTextFragments(tokenStream, text, false, 10);//highlighter.getBestFragments(tokenStream, text, 3, "...");
+ *     for (int j = 0; j &lt; frag.length; j++) {
+ *       if ((frag[j] != null) &amp;&amp; (frag[j].getScore() &gt; 0)) {
+ *         System.out.println((frag[j].toString()));
+ *       }
+ *     }
+ *     //Term vector
+ *     text = doc.get("tv");
+ *     tokenStream = TokenSources.getAnyTokenStream(searcher.getIndexReader(), hits.scoreDocs[i].doc, "tv", analyzer);
+ *     frag = highlighter.getBestTextFragments(tokenStream, text, false, 10);
+ *     for (int j = 0; j &lt; frag.length; j++) {
+ *       if ((frag[j] != null) &amp;&amp; (frag[j].getScore() &gt; 0)) {
+ *         System.out.println((frag[j].toString()));
+ *       }
+ *     }
+ *     System.out.println("-------------");
+ *   }
+ * </pre>
+ * 
+ * <h2>New features 06/02/2005</h2>
+ * 
+ * This release adds options for encoding (thanks to Nicko Cadell).
+ * An "Encoder" implementation such as the new SimpleHTMLEncoder class can be passed to the highlighter to encode
+ * all those non-xhtml standard characters such as &amp; into legal values. This simple class may not suffice for
+ * some languages -  Commons Lang has an implementation that could be used: escapeHtml(String) in
+ * http://svn.apache.org/viewcvs.cgi/jakarta/commons/proper/lang/trunk/src/java/org/apache/commons/lang/StringEscapeUtils.java?rev=137958&amp;view=markup
+ * 
+ * <h2>New features 22/12/2004</h2>
+ * 
+ * This release adds some new capabilities:
+ * <ol>
+ *   <li>Faster highlighting using Term vector support</li>
+ *   <li>New formatting options to use color intensity to show informational value</li>
+ *   <li>Options for better summarization by using term IDF scores to influence fragment selection</li>
+ * </ol>
+ * 
+ * <p>
+ * The highlighter takes a TokenStream as input. Until now these streams have typically been produced
+ * using an Analyzer but the new class TokenSources provides helper methods for obtaining TokenStreams from
+ * the new TermVector position support (see latest CVS version).</p>
+ * 
+ * <p>The new class GradientFormatter can use a scale of colors to highlight terms according to their score.
+ * A subtle use of color can help emphasise the reasons for matching (useful when doing "MoreLikeThis" queries and
+ * you want to see what the basis of the similarities is).</p>
+ * 
+ * <p>The QueryScorer class has a new constructor which can use an IndexReader to derive the IDF (inverse document frequency)
+ * for each term in order to influence the score. This is useful for extracting the most significant sections
+ * of a document and in supplying scores used by the new GradientFormatter to color significant words more strongly.
+ * The QueryScorer.getMaxWeight method is useful when passed to the GradientFormatter constructor to define the top score
+ * which is associated with the top color.</p>
+ */
+package org.apache.lucene.search.highlight;
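
Note on the encoding option described in the package text above: the sketch below illustrates the kind of character escaping an "Encoder" performs. It is a minimal, self-contained illustration and not the code of SimpleHTMLEncoder; the class name HtmlEncodeSketch and the method encode are hypothetical, and the set of escaped characters (&, <, >, ") is an assumption chosen for the example.

```java
final class HtmlEncodeSketch {

    /**
     * Hypothetical helper: replaces the four characters that have special
     * meaning in (X)HTML text and attribute values with entity references.
     * Returns an empty string for null or empty input.
     */
    static String encode(String plain) {
        if (plain == null || plain.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder(plain.length() + 16);
        for (int i = 0; i < plain.length(); i++) {
            char c = plain.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Encode a fragment before wrapping it in highlight tags.
        System.out.println(encode("AT&T <b>\"quoted\"</b>"));
    }
}
```

Escaping has to happen before the highlight tags are added, otherwise the tags themselves would be escaped along with the text.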

Copied: lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package-info.java (from r1658398, lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package.html)
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package-info.java?p2=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package-info.java&p1=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package.html&r1=1658398&r2=1658447&rev=1658447&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package.html (original)
+++ lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/postingshighlight/package-info.java Mon Feb  9 16:12:32 2015
@@ -1,22 +1,21 @@
-<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
 
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-Highlighter implementation that uses offsets from postings lists.
-</body>
-</html>
\ No newline at end of file
+/**
+ * Highlighter implementation that uses offsets from postings lists.
+ */
+package org.apache.lucene.search.postingshighlight;

Copied: lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package-info.java (from r1658398, lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package.html)
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package-info.java?p2=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package-info.java&p1=lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package.html&r1=1658398&r2=1658447&rev=1658447&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package.html (original)
+++ lucene/dev/trunk/lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/package-info.java Mon Feb  9 16:12:32 2015
@@ -1,196 +1,194 @@
-<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
-
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<html>
-<body>
-This is an another highlighter implementation.
-
-<h2>Features</h2>
-<ul>
-<li>fast for large docs</li>
-<li>support N-gram fields</li>
-<li>support phrase-unit highlighting with slops</li>
-<li>support multi-term (includes wildcard, range, regexp, etc) queries</li>
-<li>need Java 1.5</li>
-<li>highlight fields need to be stored with Positions and Offsets</li>
-<li>take into account query boost and/or IDF-weight to score fragments</li>
-<li>support colored highlight tags</li>
-<li>pluggable FragListBuilder / FieldFragList</li>
-<li>pluggable FragmentsBuilder</li>
-</ul>
-
-<h2>Algorithm</h2>
-<p>To explain the algorithm, let's use the following sample text
- (to be highlighted) and user query:</p>
-
-<table border=1>
-<tr>
-<td><b>Sample Text</b></td>
-<td>Lucene is a search engine library.</td>
-</tr>
-<tr>
-<td><b>User Query</b></td>
-<td>Lucene^2 OR "search library"~1</td>
-</tr>
-</table>
-
-<p>The user query is a BooleanQuery that consists of TermQuery("Lucene") 
-with boost of 2 and PhraseQuery("search library") with slop of 1.</p>
-<p>For your convenience, here is the offsets and positions info of the 
-sample text.</p>
-
-<pre>
-+--------+-----------------------------------+
-|        |          1111111111222222222233333|
-|  offset|01234567890123456789012345678901234|
-+--------+-----------------------------------+
-|document|Lucene is a search engine library. |
-+--------*-----------------------------------+
-|position|0      1  2 3      4      5        |
-+--------*-----------------------------------+
-</pre>
-
-<h3>Step 1.</h3>
-<p>In Step 1, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldQuery.QueryPhraseMap} from the user query.
-<code>QueryPhraseMap</code> consists of the following members:</p>
-<pre class="prettyprint">
-public class QueryPhraseMap {
-  boolean terminal;
-  int slop;   // valid if terminal == true and phraseHighlight == true
-  float boost;  // valid if terminal == true
-  Map&lt;String, QueryPhraseMap&gt; subMap;
-} 
-</pre>
-<p><code>QueryPhraseMap</code> has subMap. The key of the subMap is a term 
-text in the user query and the value is a subsequent <code>QueryPhraseMap</code>.
-If the query is a term (not phrase), then the subsequent <code>QueryPhraseMap</code>
-is marked as terminal. If the query is a phrase, then the subsequent <code>QueryPhraseMap</code>
-is not a terminal and it has the next term text in the phrase.</p>
-
-<p>From the sample user query, the following <code>QueryPhraseMap</code> 
-will be generated:</p>
-<pre>
-   QueryPhraseMap
-+--------+-+  +-------+-+
-|"Lucene"|o+->|boost=2|*|  * : terminal
-+--------+-+  +-------+-+
-
-+--------+-+  +---------+-+  +-------+------+-+
-|"search"|o+->|"library"|o+->|boost=1|slop=1|*|
-+--------+-+  +---------+-+  +-------+------+-+
-</pre>
-
-<h3>Step 2.</h3>
-<p>In Step 2, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldTermStack}. Fast Vector Highlighter uses term vector data
-(must be stored {@link org.apache.lucene.document.FieldType#setStoreTermVectorOffsets(boolean)} and {@link org.apache.lucene.document.FieldType#setStoreTermVectorPositions(boolean)})
-to generate it. <code>FieldTermStack</code> keeps the terms in the user query.
-Therefore, in this sample case, Fast Vector Highlighter generates the following <code>FieldTermStack</code>:</p>
-<pre>
-   FieldTermStack
-+------------------+
-|"Lucene"(0,6,0)   |
-+------------------+
-|"search"(12,18,3) |
-+------------------+
-|"library"(26,33,5)|
-+------------------+
-where : "termText"(startOffset,endOffset,position)
-</pre>
-<h3>Step 3.</h3>
-<p>In Step 3, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldPhraseList}
-by reference to <code>QueryPhraseMap</code> and <code>FieldTermStack</code>.</p>
-<pre>
-   FieldPhraseList
-+----------------+-----------------+---+
-|"Lucene"        |[(0,6)]          |w=2|
-+----------------+-----------------+---+
-|"search library"|[(12,18),(26,33)]|w=1|
-+----------------+-----------------+---+
-</pre>
-<p>The type of each entry is <code>WeightedPhraseInfo</code> that consists of
-an array of terms offsets and weight. 
-</p>
-<h3>Step 4.</h3>
-<p>In Step 4, Fast Vector Highlighter creates <code>FieldFragList</code> by reference to
-<code>FieldPhraseList</code>. In this sample case, the following
-<code>FieldFragList</code> will be generated:</p>
-<pre>
-   FieldFragList
-+---------------------------------+
-|"Lucene"[(0,6)]                  |
-|"search library"[(12,18),(26,33)]|
-|totalBoost=3                     |
-+---------------------------------+
-</pre>
-
-<p>
-The calculation for each <code>FieldFragList.WeightedFragInfo.totalBoost</code> (weight)  
-depends on the implementation of <code>FieldFragList.add( ... )</code>:
-<pre class="prettyprint">
-  public void add( int startOffset, int endOffset, List&lt;WeightedPhraseInfo&gt; phraseInfoList ) {
-    float totalBoost = 0;
-    List&lt;SubInfo&gt; subInfos = new ArrayList&lt;SubInfo&gt;();
-    for( WeightedPhraseInfo phraseInfo : phraseInfoList ){
-      subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) );
-      totalBoost += phraseInfo.getBoost();
-    }
-    getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) );
-  }
-  
-</pre>
-The used implementation of <code>FieldFragList</code> is noted in <code>BaseFragListBuilder.createFieldFragList( ... )</code>:
-<pre class="prettyprint">
-  public FieldFragList createFieldFragList( FieldPhraseList fieldPhraseList, int fragCharSize ){
-    return createFieldFragList( fieldPhraseList, new SimpleFieldFragList( fragCharSize ), fragCharSize );
-  }
-</pre>
-<p>
-Currently there are basically to approaches available:
-</p>
-<ul>
-<li><code>SimpleFragListBuilder using SimpleFieldFragList</code>: <i>sum-of-boosts</i>-approach. The totalBoost is calculated by summarizing the query-boosts per term. Per default a term is boosted by 1.0</li>
-<li><code>WeightedFragListBuilder using WeightedFieldFragList</code>: <i>sum-of-distinct-weights</i>-approach. The totalBoost is calculated by summarizing the IDF-weights of distinct terms.</li>
-</ul> 
-<p>Comparison of the two approaches:</p>
-<table border="1">
-<caption>
-	query = das alte testament (The Old Testament)
-</caption>
-<tr><th>Terms in fragment</th><th>sum-of-distinct-weights</th><th>sum-of-boosts</th></tr>
-<tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
-<tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
-<tr><td>das testament alte</td><td>5.339621</td><td>3.0</td></tr>
-<tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
-<tr><td>das testament</td><td>2.9455688</td><td>2.0</td></tr>
-<tr><td>das alte</td><td>2.4759595</td><td>2.0</td></tr>
-<tr><td>das das das das</td><td>1.5015357</td><td>4.0</td></tr>
-<tr><td>das das das</td><td>1.3003681</td><td>3.0</td></tr>
-<tr><td>das das</td><td>1.061746</td><td>2.0</td></tr>
-<tr><td>alte</td><td>1.0</td><td>1.0</td></tr>
-<tr><td>alte</td><td>1.0</td><td>1.0</td></tr>
-<tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
-<tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
-<tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
-<tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
-<tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
-</table>
-
-<h3>Step 5.</h3>
-<p>In Step 5, by using <code>FieldFragList</code> and the field stored data,
-Fast Vector Highlighter creates highlighted snippets!</p>
-</body>
-</html>
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Another highlighter implementation based on term vectors.
+ * 
+ * <h2>Features</h2>
+ * <ul>
+ * <li>fast for large docs</li>
+ * <li>support N-gram fields</li>
+ * <li>support phrase-unit highlighting with slops</li>
+ * <li>support multi-term (includes wildcard, range, regexp, etc) queries</li>
+ * <li>highlight fields need to be stored with Positions and Offsets</li>
+ * <li>take into account query boost and/or IDF-weight to score fragments</li>
+ * <li>support colored highlight tags</li>
+ * <li>pluggable FragListBuilder / FieldFragList</li>
+ * <li>pluggable FragmentsBuilder</li>
+ * </ul>
+ * 
+ * <h2>Algorithm</h2>
+ * <p>To explain the algorithm, let's use the following sample text
+ *  (to be highlighted) and user query:</p>
+ * 
+ * <table border=1 summary="sample document and query">
+ * <tr>
+ * <td><b>Sample Text</b></td>
+ * <td>Lucene is a search engine library.</td>
+ * </tr>
+ * <tr>
+ * <td><b>User Query</b></td>
+ * <td>Lucene^2 OR "search library"~1</td>
+ * </tr>
+ * </table>
+ * 
+ * <p>The user query is a BooleanQuery that consists of TermQuery("Lucene") 
+ * with boost of 2 and PhraseQuery("search library") with slop of 1.</p>
+ * <p>For your convenience, here are the offsets and positions of the
+ * sample text.</p>
+ * 
+ * <pre>
+ * +--------+-----------------------------------+
+ * |        |          1111111111222222222233333|
+ * |  offset|01234567890123456789012345678901234|
+ * +--------+-----------------------------------+
+ * |document|Lucene is a search engine library. |
+ * +--------*-----------------------------------+
+ * |position|0      1  2 3      4      5        |
+ * +--------*-----------------------------------+
+ * </pre>
+ * 
+ * <h3>Step 1.</h3>
+ * <p>In Step 1, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldQuery.QueryPhraseMap} from the user query.
+ * <code>QueryPhraseMap</code> consists of the following members:</p>
+ * <pre class="prettyprint">
+ * public class QueryPhraseMap {
+ *   boolean terminal;
+ *   int slop;   // valid if terminal == true and phraseHighlight == true
+ *   float boost;  // valid if terminal == true
+ *   Map&lt;String, QueryPhraseMap&gt; subMap;
+ * } 
+ * </pre>
+ * <p><code>QueryPhraseMap</code> has a subMap. The key of the subMap is a term
+ * text in the user query and the value is a subsequent <code>QueryPhraseMap</code>.
+ * If the query is a term (not phrase), then the subsequent <code>QueryPhraseMap</code>
+ * is marked as terminal. If the query is a phrase, then the subsequent <code>QueryPhraseMap</code>
+ * is not a terminal and it has the next term text in the phrase.</p>
+ * 
+ * <p>From the sample user query, the following <code>QueryPhraseMap</code> 
+ * will be generated:</p>
+ * <pre>
+ * QueryPhraseMap
+ * +--------+-+  +-------+-+
+ * |"Lucene"|o+-&gt;|boost=2|*|  * : terminal
+ * +--------+-+  +-------+-+
+ * 
+ * +--------+-+  +---------+-+  +-------+------+-+
+ * |"search"|o+-&gt;|"library"|o+-&gt;|boost=1|slop=1|*|
+ * +--------+-+  +---------+-+  +-------+------+-+
+ * </pre>
+ * 
+ * <h3>Step 2.</h3>
+ * <p>In Step 2, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldTermStack}. Fast Vector Highlighter uses term vector data
+ * (must be stored {@link org.apache.lucene.document.FieldType#setStoreTermVectorOffsets(boolean)} and {@link org.apache.lucene.document.FieldType#setStoreTermVectorPositions(boolean)})
+ * to generate it. <code>FieldTermStack</code> keeps the terms in the user query.
+ * Therefore, in this sample case, Fast Vector Highlighter generates the following <code>FieldTermStack</code>:</p>
+ * <pre>
+ * FieldTermStack
+ * +------------------+
+ * |"Lucene"(0,6,0)   |
+ * +------------------+
+ * |"search"(12,18,3) |
+ * +------------------+
+ * |"library"(26,33,5)|
+ * +------------------+
+ * where : "termText"(startOffset,endOffset,position)
+ * </pre>
+ * <h3>Step 3.</h3>
+ * <p>In Step 3, Fast Vector Highlighter generates {@link org.apache.lucene.search.vectorhighlight.FieldPhraseList}
+ * by reference to <code>QueryPhraseMap</code> and <code>FieldTermStack</code>.</p>
+ * <pre>
+ * FieldPhraseList
+ * +----------------+-----------------+---+
+ * |"Lucene"        |[(0,6)]          |w=2|
+ * +----------------+-----------------+---+
+ * |"search library"|[(12,18),(26,33)]|w=1|
+ * +----------------+-----------------+---+
+ * </pre>
+ * <p>The type of each entry is <code>WeightedPhraseInfo</code>, which consists of
+ * an array of term offsets and a weight.
+ * </p>
+ * <h3>Step 4.</h3>
+ * <p>In Step 4, Fast Vector Highlighter creates <code>FieldFragList</code> by reference to
+ * <code>FieldPhraseList</code>. In this sample case, the following
+ * <code>FieldFragList</code> will be generated:</p>
+ * <pre>
+ * FieldFragList
+ * +---------------------------------+
+ * |"Lucene"[(0,6)]                  |
+ * |"search library"[(12,18),(26,33)]|
+ * |totalBoost=3                     |
+ * +---------------------------------+
+ * </pre>
+ * 
+ * <p>
+ * The calculation for each <code>FieldFragList.WeightedFragInfo.totalBoost</code> (weight)  
+ * depends on the implementation of <code>FieldFragList.add( ... )</code>:
+ * <pre class="prettyprint">
+ *   public void add( int startOffset, int endOffset, List&lt;WeightedPhraseInfo&gt; phraseInfoList ) {
+ *     float totalBoost = 0;
+ *     List&lt;SubInfo&gt; subInfos = new ArrayList&lt;SubInfo&gt;();
+ *     for( WeightedPhraseInfo phraseInfo : phraseInfoList ){
+ *       subInfos.add( new SubInfo( phraseInfo.getText(), phraseInfo.getTermsOffsets(), phraseInfo.getSeqnum() ) );
+ *       totalBoost += phraseInfo.getBoost();
+ *     }
+ *     getFragInfos().add( new WeightedFragInfo( startOffset, endOffset, subInfos, totalBoost ) );
+ *   }
+ *   
+ * </pre>
+ * The <code>FieldFragList</code> implementation in use is chosen in <code>BaseFragListBuilder.createFieldFragList( ... )</code>:
+ * <pre class="prettyprint">
+ *   public FieldFragList createFieldFragList( FieldPhraseList fieldPhraseList, int fragCharSize ){
+ *     return createFieldFragList( fieldPhraseList, new SimpleFieldFragList( fragCharSize ), fragCharSize );
+ *   }
+ * </pre>
+ * <p>
+ * Currently there are basically two approaches available:
+ * </p>
+ * <ul>
+ * <li><code>SimpleFragListBuilder</code> using <code>SimpleFieldFragList</code>: <i>sum-of-boosts</i> approach. The totalBoost is calculated by summing the query boosts per term. By default a term is boosted by 1.0.</li>
+ * <li><code>WeightedFragListBuilder</code> using <code>WeightedFieldFragList</code>: <i>sum-of-distinct-weights</i> approach. The totalBoost is calculated by summing the IDF weights of distinct terms.</li>
+ * </ul> 
+ * <p>Comparison of the two approaches:</p>
+ * <table border="1">
+ * <caption>
+ *   query = das alte testament (The Old Testament)
+ * </caption>
+ * <tr><th>Terms in fragment</th><th>sum-of-distinct-weights</th><th>sum-of-boosts</th></tr>
+ * <tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
+ * <tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
+ * <tr><td>das testament alte</td><td>5.339621</td><td>3.0</td></tr>
+ * <tr><td>das alte testament</td><td>5.339621</td><td>3.0</td></tr>
+ * <tr><td>das testament</td><td>2.9455688</td><td>2.0</td></tr>
+ * <tr><td>das alte</td><td>2.4759595</td><td>2.0</td></tr>
+ * <tr><td>das das das das</td><td>1.5015357</td><td>4.0</td></tr>
+ * <tr><td>das das das</td><td>1.3003681</td><td>3.0</td></tr>
+ * <tr><td>das das</td><td>1.061746</td><td>2.0</td></tr>
+ * <tr><td>alte</td><td>1.0</td><td>1.0</td></tr>
+ * <tr><td>alte</td><td>1.0</td><td>1.0</td></tr>
+ * <tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
+ * <tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
+ * <tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
+ * <tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
+ * <tr><td>das</td><td>0.7507678</td><td>1.0</td></tr>
+ * </table>
+ * 
+ * <h3>Step 5.</h3>
+ * <p>In Step 5, by using <code>FieldFragList</code> and the field stored data,
+ * Fast Vector Highlighter creates highlighted snippets!</p>
+ */
+package org.apache.lucene.search.vectorhighlight;
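
Note on the comparison table in the vectorhighlight package text: the two scoring approaches can be sketched in plain Java without Lucene. This is an illustration only, not the code of SimpleFieldFragList or WeightedFieldFragList. The class and method names are hypothetical. The weights for "das" and "alte" are read from the single-term rows of the table; the weight for "testament" is an assumption back-computed from the "das testament" row. Scaling the summed weights by the square root of the number of terms in the fragment is an inference from the table figures (it reproduces the rows to within rounding), not something the text states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FragmentScoreSketch {

    // Per-term weights: "das" and "alte" come from the table's single-term rows;
    // "testament" is an assumed value back-computed from the "das testament" row.
    static final Map<String, Float> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("das", 0.7507678f);
        WEIGHTS.put("alte", 1.0f);
        WEIGHTS.put("testament", 1.3320639f);
    }

    /** sum-of-boosts: every matched term occurrence adds its query boost (1.0 here). */
    static float sumOfBoosts(List<String> terms) {
        float total = 0f;
        for (int i = 0; i < terms.size(); i++) {
            total += 1.0f;
        }
        return total;
    }

    /**
     * sum-of-distinct-weights: each distinct term adds its weight once; the sum is
     * then scaled by sqrt(number of terms), an assumption inferred from the table.
     */
    static float sumOfDistinctWeights(List<String> terms) {
        Set<String> distinct = new LinkedHashSet<>(terms);
        float sum = 0f;
        for (String term : distinct) {
            sum += WEIGHTS.get(term);
        }
        return sum * (float) Math.sqrt(terms.size());
    }

    public static void main(String[] args) {
        String[] fragments = {
            "das alte testament", "das testament", "das alte",
            "das das das das", "das das", "alte", "das"
        };
        for (String fragment : fragments) {
            List<String> terms = Arrays.asList(fragment.split(" "));
            System.out.printf("%-20s distinct-weights=%.4f boosts=%.1f%n",
                fragment, sumOfDistinctWeights(terms), sumOfBoosts(terms));
        }
    }
}
```

The sketch makes the difference in the table visible: under sum-of-boosts a fragment that repeats "das" four times outscores one containing all three query terms, while under sum-of-distinct-weights the repeated term is counted once and the fragment with all three terms ranks first.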