Posted to java-commits@lucene.apache.org by gs...@apache.org on 2006/11/27 01:00:49 UTC
svn commit: r479465 [3/4] - in /lucene/java/trunk: docs/ docs/images/
docs/lucene-sandbox/ docs/styles/ src/site/ src/site/src/
src/site/src/documentation/ src/site/src/documentation/classes/
src/site/src/documentation/conf/ src/site/src/documentation/...
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/index.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/index.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/index.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/index.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,186 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Overview
+ </title>
+ </header>
+ <properties>
+ <author email="jon at latchkey.com">Jon S. Stevens</author>
+ <author email="husted at apache.org">Ted Husted</author>
+ <author email="cutting at apache.org">Doug Cutting</author>
+ <author email="carlson at apache.org">Peter Carlson</author>
+ </properties>
+ <body>
+ <section id="Apache Lucene">
+ <title>Apache Lucene</title>
+ <p>
+ Apache Lucene is a high-performance, full-featured text search engine
+ library written entirely in Java. It is a technology suitable for nearly any
+ application that requires full-text search, especially cross-platform.
+ </p>
+ <p>
+ Apache Lucene is an open source project available for
+ <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">free download</a>.
+ Please use the links on the left to access Lucene.
+ </p>
+ </section>
+
+ <section id="Lucene News">
+ <title>Lucene News</title>
+ <section><title>10 November 2006</title>
+ <p>New <a href="http://forrest.apache.org">Forrest</a>-based site released. The Lucene Java website now has a consistent look and feel with its <a href="http://lucene.apache.org">Lucene</a> siblings.
+ </p>
+ </section>
+ <section>
+ <title>26 May 2006 - Release 2.0.0 available</title>
+
+ <p>This is mostly a bugfix release from release 1.9.1.
+ Note however that deprecated 1.x features have now
+ been removed. Any code that compiles against Lucene
+ 1.9.1 without deprecation warnings should work without
+ further changes with any 2.x release. For more
+ information about this release, please read
+ <a
+ href="http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_0_0/CHANGES.txt">
+ CHANGES.txt</a>
+ .
+ </p>
+
+ <p>Binary and source distributions are
+ available
+ <a
+ href="http://www.apache.org/dyn/closer.cgi/lucene/java/">here</a>
+ .
+ </p>
+ </section>
+ <section>
+ <title>2 March 2006 - Release 1.9.1 available</title>
+
+ <p>This fixes a serious bug in release 1.9-final. See
+ <a
+ href="http://svn.apache.org/repos/asf/lucene/java/tags/lucene_1_9_1/CHANGES.txt">
+ CHANGES.txt</a>
+ for details.
+ </p>
+
+ <p>Binary and source distributions are
+ available
+ <a
+ href="http://www.apache.org/dyn/closer.cgi/lucene/java/">here</a>
+ .
+ </p>
+ </section>
+ <section>
+ <title>27 February 2006 - 1.9 final available</title>
+
+ <p>This release has many improvements since release
+ 1.4.3, including new features, performance
+ improvements, bug fixes, etc. See
+ <a
+ href="http://svn.apache.org/repos/asf/lucene/java/tags/lucene_1_9_final/CHANGES.txt">
+ CHANGES.txt</a>
+ for details.
+ </p>
+
+ <p>1.9 will be the last 1.x release. It is both
+ back-compatible with 1.4.3 and forward-compatible with
+ the upcoming 2.0 release. Many methods and classes in
+ 1.4.3 have been deprecated in 1.9 and will be removed
+ in 2.0. Applications must compile against 1.9 without
+ deprecation warnings before they are compatible with
+ 2.0.</p>
+
+ <p>Binary and source distributions are
+ available
+ <a
+ href="http://www.apache.org/dyn/closer.cgi/lucene/java/">here</a>
+ .
+ </p>
+ </section>
+ <section>
+ <title>26 January 2006 - Nightly builds available</title>
+
+ <p>Nightly builds of the current development version of Lucene, to be released as Lucene 1.9,
+ are now available at
+ <a href="http://cvs.apache.org/dist/lucene/java/nightly/">
+ http://cvs.apache.org/dist/lucene/java/nightly/</a>
+ .
+ </p>
+
+ </section>
+ <section>
+ <title>28 October 2005 - Lucene at ApacheCon</title>
+ <p>
+ <a href="http://www.apachecon.com">
+ <img src="http://apachecon.com/2005/US/logos/Conference135x59.jpg"/>
+ </a>
+ </p>
+ <p>Monday, December 12, 2005 at 3pm by Grant Ingersoll:
+ <br/>
+ Abstract:
+ <br/>
+ Lucene is a high performance, scalable, cross-platform search engine that contains many advanced
+ features that often go untapped by the majority of users. In this session, designed for those
+ familiar with Lucene, we will examine some of Lucene's more advanced topics and their application,
+ including:
+ </p>
+ <ol>
+ <li>Term Vectors: Manual and Pseudo relevance feedback; Advanced document collection analysis for
+ domain specialization</li>
+ <li>Span Queries: Better phrase matching; Candidate Identification for Question Answering</li>
+ <li>Tying it all Together: Building a search framework for experimentation and rapid deployment</li>
+ <li>Case Studies from
+ <a href="http://www.cnlp.org">CNLP</a>
+ : Crosslingual/multilingual retrieval in Arabic, English and Dutch;
+ Sublanguage specialization for commercial trouble ticket analysis; Passage retrieval and
+ analysis for Question Answering application
+ </li>
+ </ol>
+ <p>Topics 1 through 3 will provide technical details on implementing the advanced Lucene features, while
+ the fourth topic will provide a broader context for understanding when and where to use these
+ features.
+ </p>
+ </section>
+ <section>
+ <title>14 February 2005 - Lucene moves to Apache top-level</title>
+
+ <p>Lucene has migrated from Apache's Jakarta project to the top-level. Along with this migration,
+ the source code repository has been converted to Subversion. The migration is in progress with
+ some loose ends. Please stay tuned!
+ </p>
+ </section>
+ <section>
+ <title>December 2004 -
+ <em>Lucene in Action</em>
+ is published
+ </title>
+
+ <a href="http://www.lucenebook.com/">
+ <img border="0" align="left"
+ src="images/lia_3d.jpg"/>
+ </a>
+ <p>The first book dedicated solely to Lucene is published. The
+ "search inside the book" feature implemented with Lucene can
+ be seen at
+ <a href="http://www.lucenebook.com/">lucenebook.com</a>
+ .
+ </p>
+ </section>
+ <p style="clear: both;"/>
+ <section>
+ <title>29 November 2004 - Lucene 1.4.3 Released</title>
+
+ <p>This fixes a few bugs in 1.4.2. See
+ <a
+ href="http://svn.apache.org/repos/asf/lucene/java/tags/lucene_1_4_3/CHANGES.txt">
+ CHANGES.txt</a>
+ for details. Binary and source distributions are
+ available
+ <a href="http://www.apache.org/dyn/closer.cgi/lucene/">here</a>
+ . After choosing your mirror, navigate to the archive section via the java link.
+ </p>
+ </section>
+
+ </section>
+
+ </body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/lucene-sandbox/index.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/lucene-sandbox/index.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/lucene-sandbox/index.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/lucene-sandbox/index.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,148 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Lucene Sandbox
+ </title>
+ </header>
+ <properties>
+ <author>Otis Gospodnetic</author>
+ </properties>
+ <body>
+
+ <section id="Lucene Sandbox"><title>Lucene Sandbox</title>
+ <p>
+ The Lucene project also contains a workspace, the Lucene Sandbox, that is open to all Lucene committers, as well
+ as a few other developers. The purpose of the Sandbox is to host various third party contributions,
+ and to serve as a place to try out new ideas and prepare them for inclusion into the core Lucene
+ distribution.<br/>
+ Users are free to experiment with the components developed in the Sandbox, but Sandbox components will
+ not necessarily be maintained, particularly in their current state.
+ </p>
+
+ <p>
+ You can access the Lucene Sandbox repository at
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/">http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/</a>.
+ </p>
+
+ <section id="Snowball Stemmers for Lucene"><title>Snowball Stemmers for Lucene</title>
+ <p>
+ This project provides pre-compiled versions of the Snowball stemmers
+ for Lucene.
+ </p>
+
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/snowball">The
+ repository for the Snowball contribution.</a>
+ </p>
+
+ <p>
+ <a href="http://snowball.tartarus.org/">Background information on Snowball</a>,
+ which is a language for stemmers developed by Martin Porter.
+ </p>
+ </section>
+
+ <section id="Analyzers, Tokenizers, Filters"><title>Analyzers, Tokenizers, Filters</title>
+ <p>
+ Contributed Analyzers, Tokenizers, and Filters for various languages.
+ </p>
+
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/">The
+ repository for the Analyzers contribution.</a>
+ </p>
+ </section>
+
+ <section id="Ant"><title>Ant</title>
+ <p>
+ The Ant contribution provides an Ant task that creates a Lucene index out of an Ant fileset. It also
+ contains an example HTML parser that uses JTidy.
+ </p>
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/ant/">The
+ repository for the Ant contribution.</a>
+ </p>
+ </section>
+
+ <section id="WordNet/Synonyms"><title>WordNet/Synonyms</title>
+ <p>
+ The Lucene WordNet code consists of a single class which parses a prolog file
+ from the WordNet site that contains a list of English words and synonyms.
+ The class builds a Lucene index from the synonyms file. Your querying code could
+ hit this index to build up a set of synonyms for the terms in the
+ search query.
+ </p>
+ <p>
+ More information on the <a href="http://www.tropo.com/techno/java/lucene/wordnet.html">Lucene WordNet package</a>.
+ <a href="http://wordnet.princeton.edu/">WordNet</a> is an online database of English language words that contains
+ synonyms, definitions, and various relationships between synonym sets.
+ </p>
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/wordnet/">The
+ repository for the WordNet module.</a>
+ </p>
+ </section>
+
+ <section id="Lucli - Lucene Command-line Interface"><title>Lucli - Lucene Command-line Interface</title>
+ <p>
+ The Lucli application allows index manipulation from the
+ command-line.
+ </p>
+
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/lucli/">The
+ repository for the Lucli contribution.</a>
+ </p>
+ </section>
+
+ <section id="Term Highlighter"><title>Term Highlighter</title>
+ <p>
+ A small set of classes for highlighting matching terms in
+ search results.
+ </p>
+ <p>
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/highlighter/">The
+ repository for the Highlighter contribution.</a>
+ </p>
+ </section>
+
+ <section id="Javascript Query Constructor"><title>Javascript Query Constructor</title>
+ <p>
+ Javascript library to support client-side query-building. Provides support for a user interface similar to
+ <a href="http://www.google.com.sg/advanced_search">Google's Advanced Search</a>.
+ </p>
+ <p>
+
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/javascript/queryConstructor/">The
+ repository for the Javascript Query Constructor files.</a>
+ </p>
+ </section>
+
+ <section id="Javascript Query Validator"><title>Javascript Query Validator</title>
+ <p>
+ Javascript library to support client-side query validation. Lucene rejects malformed queries by
+ throwing a ParseException, whose message is often difficult to interpret and pass on to the user. This library aims to
+ alleviate that problem.
+ </p>
+ <p>
+
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/javascript/queryValidator/">The
+ repository for the Javascript Query Validator files.</a>
+ </p>
+ </section>
+
+ <section id="High Frequency Terms"><title>High Frequency Terms</title>
+ <p>
+ The miscellaneous package is for classes that don't fit anywhere else. The only class in it right now determines
+ what terms occur the most inside a Lucene index. This could be useful for analyzing which terms may need to go
+ into a custom stop word list for better search results.
+ </p>
+ <p>
+
+ <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/miscellaneous/">The
+ repository for miscellaneous classes.</a>
+ </p>
+ </section>
+
+ </section>
+
+ </body>
+</document>
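As a rough illustration of the High Frequency Terms idea described in lucene-sandbox/index.xml above (finding the terms that occur most often, as candidates for a custom stop word list), here is a minimal Java sketch that counts terms over plain in-memory strings. It does not read a Lucene index and is not the contrib class itself; the class name HighFrequencyTermsSketch, the method topTerms, and the simple non-word-character tokenization are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in: counts terms in raw strings, not in a Lucene index.
final class HighFrequencyTermsSketch {

    /** Returns up to n terms, most frequent first; ties are broken alphabetically. */
    static List<String> topTerms(List<String> documents, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : documents) {
            // Assumed tokenization: lower-case, split on runs of non-word characters.
            for (String token : doc.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the quick fox", "the lazy dog", "the fox");
        System.out.println(topTerms(docs, 2)); // prints [the, fox]
    }
}
```

Terms that dominate such a list (here "the") are the ones worth reviewing for a stop word list; whether to actually drop them depends on the application.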
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/mailinglists.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/mailinglists.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/mailinglists.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/mailinglists.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,101 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Mailing Lists
+ </title>
+ </header>
+ <body>
+ <section id="Java User List"><title>Java User List</title>
+ <p>
+ This list is for users of Java Lucene to ask questions, share knowledge,
+ and discuss issues.
+ </p>
+ <ul>
+ <li><a href="mailto:java-user-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:java-user-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-java-user/">Archive</a>
+ (<a href="http://mail-archives.apache.org/mod_mbox/jakarta-lucene-user/">old archive</a>)</li>
+ <li><a href="http://www.gossamer-threads.com/lists/lucene/java-user/">Alternative
+ archive with search feature</a></li>
+ </ul>
+ </section>
+
+ <section id="Java Developer List"><title>Java Developer List</title>
+ <p>
+ This is the list where participating developers of the Java Lucene project meet
+ and discuss issues, code changes/additions, etc. Do not send mail to this list
+ with usage questions or configuration questions and problems.
+ </p>
+ <p>
+ Discussion list:
+ <ul>
+ <li><a href="mailto:java-dev-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:java-dev-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-java-dev/">Archive</a>
+ (<a href="http://mail-archives.apache.org/mod_mbox/jakarta-lucene-dev/">old archive</a>)</li>
+ <li><a href="http://www.gossamer-threads.com/lists/lucene/java-dev/">Alternative
+ archive with search feature</a></li>
+ </ul><br/>
+ Commit notifications:
+ <ul>
+ <li><a href="mailto:java-commits-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:java-commits-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-java-commits/">Archive</a></li>
+ </ul>
+ </p>
+ </section>
+
+ <section id="Lucene4c Developer List"><title>Lucene4c Developer List</title>
+ <p>
+ This is the list where participating developers of the lucene4c
+ project meet and discuss issues related to development of
+ lucene4c. Do not send mail to this list with usage or
+ configuration questions and problems.
+ </p>
+ <p>
+ Discussion list:
+ <ul>
+ <li><a href="mailto:c-dev-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:c-dev-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-c-dev/">Archive</a></li>
+ <li><a href="http://www.gossamer-threads.com/lists/lucene/c-dev/">Alternative
+ archive with search feature</a></li>
+ </ul><br/>
+ Commit notifications:
+ <ul>
+ <li><a href="mailto:c-commits-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:c-commits-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-c-commits/">Archive</a></li>
+ </ul>
+ </p>
+ </section>
+
+ <section id="Ruby Developer List"><title>Ruby Developer List</title>
+ <p>
+ Discussion list for developers of Ruby/SWIG Lucene.
+ </p>
+ <ul>
+ <li><a href="mailto:ruby-dev-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:ruby-dev-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-ruby-dev/">Archive</a></li>
+ <li><a href="http://www.gossamer-threads.com/lists/lucene/ruby-dev/">Alternative
+ archive with search feature</a></li>
+ </ul>
+ </section>
+
+ <section id="General Lucene List"><title>General Lucene List</title>
+ <p>
+ General discussion concerning all Lucene subprojects.
+ </p>
+ <ul>
+ <li><a href="mailto:general-subscribe@lucene.apache.org">Subscribe</a></li>
+ <li><a href="mailto:general-unsubscribe@lucene.apache.org">Unsubscribe</a></li>
+ <li><a href="http://mail-archives.apache.org/mod_mbox/lucene-general/">Archive</a></li>
+ <li><a href="http://www.gossamer-threads.com/lists/lucene/general/">Alternative
+ archive with search feature</a></li>
+ </ul>
+ </section>
+
+ </body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/queryparsersyntax.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/queryparsersyntax.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/queryparsersyntax.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/queryparsersyntax.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,236 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Query Parser Syntax
+ </title>
+ </header>
+ <properties>
+ <author email="carlson@apache.org">Peter Carlson</author>
+ </properties>
+ <body>
+ <section id="Overview">
+ <title>Overview</title>
+ <p>Although Lucene provides the ability to create your own
+ queries through its API, it also provides a rich query
+ language through the Query Parser, a lexer which
+ interprets a string into a Lucene Query using JavaCC.
+ </p>
+
+ <p>This page provides the Query Parser syntax in Lucene 1.9.
+ If you are using a different
+ version of Lucene, please consult the copy of
+ <code>docs/queryparsersyntax.html</code> that was distributed
+ with the version you are using.
+ </p>
+ <p>
+ Before choosing to use the provided Query Parser, please consider the following:
+ <ol>
+ <li>If you are programmatically generating a query string and then
+ parsing it with the query parser then you should seriously consider building
+ your queries directly with the query API. In other words, the query
+ parser is designed for human-entered text, not for program-generated
+ text.</li>
+
+ <li>Untokenized fields are best added directly to queries, and not
+ through the query parser. If a field's values are generated programmatically
+ by the application, then so should query clauses for this field.
+ An analyzer, which the query parser uses, is designed to convert human-entered
+ text to terms. Program-generated values, like dates, keywords, etc.,
+ should be consistently program-generated.</li>
+
+ <li>In a query form, fields which are general text should use the query
+ parser. All others, such as date ranges, keywords, etc. are better added
+ directly through the query API. A field with a limited set of values
+ that can be specified with a pull-down menu should not be added to a
+ query string which is subsequently parsed, but rather added as a
+ TermQuery clause.</li>
+ </ol>
+ </p>
+ </section>
+
+ <section id="Terms">
+ <title>Terms</title>
+ <p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p>
+ <p>A Single Term is a single word such as "test" or "hello".</p>
+ <p>A Phrase is a group of words surrounded by double quotes such as "hello dolly".</p>
+ <p>Multiple terms can be combined together with Boolean operators to form a more complex query (see below).</p>
+ <p>Note: The analyzer used to create the index will be used on the terms and phrases in the query string.
+ So it is important to choose an analyzer that will not interfere with the terms used in the query string.</p>
+ </section>
+
+ <section id="Fields">
+ <title>Fields</title>
+ <p>Lucene supports fielded data. When performing a search you can either specify a field, or use the default field. The field names and the default field are implementation specific.</p>
+ <p>You can search any field by typing the field name followed by a colon ":" and then the term you are looking for. </p>
+ <p>As an example, let's assume a Lucene index contains two fields, title and text, and text is the default field.
+ If you want to find the document entitled "The Right Way" which contains the text "don't go this way", you can enter: </p>
+
+ <source>title:"The Right Way" AND text:go</source>
+ <p>or</p>
+ <source>title:"The Right Way" AND go</source>
+ <p>Since text is the default field, the field indicator is not required.</p>
+
+ <p>Note: The field is only valid for the term that it directly precedes, so the query</p>
+ <source>title:Do it right</source>
+ <p>will only find "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field). </p>
+ </section>
+
+ <section id="Term Modifiers">
+ <title>Term Modifiers</title>
+ <p>Lucene supports modifying query terms to provide a wide range of searching options.</p>
+
+ <section id="Wildcard Searches">
+ <title>Wildcard Searches</title>
+ <p>Lucene supports single and multiple character wildcard searches.</p>
+ <p>To perform a single character wildcard search use the "?" symbol.</p>
+ <p>To perform a multiple character wildcard search use the "*" symbol.</p>
+ <p>The single character wildcard search looks for terms that match the pattern with the "?" replaced by any single character. For example, to search for "text" or "test" you can use the search:</p>
+
+ <source>te?t</source>
+
+ <p>Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search: </p>
+ <source>test*</source>
+ <p>You can also use the wildcard searches in the middle of a term.</p>
+ <source>te*t</source>
+ <p>Note: You cannot use a * or ? symbol as the first character of a search.</p>
+ </section>
+
+
+ <section id="Fuzzy Searches">
+ <title>Fuzzy Searches</title>
+ <p>Lucene supports fuzzy searches based on the Levenshtein Distance, or Edit Distance, algorithm. To do a fuzzy search, use the tilde symbol, "~", at the end of a single-word term. For example, to search for a term similar in spelling to "roam", use the fuzzy search:</p>
+
+ <source>roam~</source>
+ <p>This search will find terms like foam and roams.</p>
+
+ <p>Starting with Lucene 1.9, an additional (optional) parameter can specify the required similarity. The value is between 0 and 1; the closer the value is to 1, the higher the similarity a term needs in order to match. For example:</p>
+ <source>roam~0.8</source>
+ <p>The default that is used if the parameter is not given is 0.5.</p>
+ </section>
+
+
+ <section id="Proximity Searches">
+ <title>Proximity Searches</title>
+ <p>Lucene supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde symbol, "~", at the end of a phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document, use the search:</p>
+
+ <source>"jakarta apache"~10</source>
+ </section>
+
+
+ <section id="Range Searches">
+ <title>Range Searches</title>
+ <p>Range Queries allow one to match documents whose field(s) values
+ are between the lower and upper bound specified by the Range Query.
+ Range Queries can be inclusive or exclusive of the upper and lower bounds.
+ Sorting is done lexicographically.</p>
+ <source>mod_date:[20020101 TO 20030101]</source>
+ <p>This will find documents whose mod_date fields have values between 20020101 and 20030101, inclusive.
+ Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:</p>
+ <source>title:{Aida TO Carmen}</source>
+ <p>This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen.</p>
+ <p>Inclusive range queries are denoted by square brackets. Exclusive range queries are denoted by
+ curly brackets.</p>
+ </section>
+
+
+ <section id="Boosting a Term">
+ <title>Boosting a Term</title>
+ <p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p>
+ <p>Boosting allows you to control the relevance of a document by boosting its terms. For example, if you are searching for</p>
+
+ <source>jakarta apache</source>
+ <p>and you want the term "jakarta" to be more relevant, boost it using the ^ symbol along with the boost factor next to the term.
+ You would type:</p>
+ <source>jakarta^4 apache</source>
+ <p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p>
+
+ <source>"jakarta apache"^4 "Apache Lucene"</source>
+ <p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).</p>
+ </section>
+
+ </section>
+
+
+ <section id="Boolean operators">
+ <title>Boolean Operators</title>
+ <p>Boolean operators allow terms to be combined through logic operators.
+ Lucene supports AND, "+", OR, NOT and "-" as Boolean operators. (Note: Boolean operators must be ALL CAPS.)</p>
+
+ <section id="OR">
+ <title>OR</title>
+ <p>The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used.
+ The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets.
+ The symbol || can be used in place of the word OR.</p>
+ <p>To search for documents that contain either "jakarta apache" or just "jakarta" use the query:</p>
+
+ <source>"jakarta apache" jakarta</source>
+
+ <p>or</p>
+
+ <source>"jakarta apache" OR jakarta</source>
+
+ </section>
+ <section id="AND">
+ <title>AND</title>
+ <p>The AND operator matches documents where both terms exist anywhere in the text of a single document.
+ This is equivalent to an intersection using sets. The symbol &amp;&amp; can be used in place of the word AND.</p>
+ <p>To search for documents that contain "jakarta apache" and "Apache Lucene" use the query: </p>
+
+ <source>"jakarta apache" AND "Apache Lucene"</source>
+ </section>
+
+ <section id="+">
+ <title>+</title>
+ <p>The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single document.</p>
+ <p>To search for documents that must contain "jakarta" and may contain "lucene" use the query:</p>
+
+ <source>+jakarta lucene</source>
+ </section>
+
+ <section id="NOT">
+ <title>NOT</title>
+ <p>The NOT operator excludes documents that contain the term after NOT.
+ This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.</p>
+ <p>To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query: </p>
+
+ <source>"jakarta apache" NOT "Apache Lucene"</source>
+ <p>Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:</p>
+
+ <source>NOT "jakarta apache"</source>
+ </section>
+
+ <section id="-">
+ <title>-</title>
+ <p>The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.</p>
+ <p>To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query: </p>
+
+ <source>"jakarta apache" -"Apache Lucene"</source>
+ </section>
+
+ </section>
+
+ <section id="Grouping">
+ <title>Grouping</title>
+ <p>Lucene supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query.</p>
+ <p>To search for either "jakarta" or "apache" and "website" use the query:</p>
+ <source>(jakarta OR apache) AND website</source>
+ <p>This eliminates any confusion and makes sure that website must exist and that either term jakarta or apache may exist.</p>
+ </section>
+
+ <section id="Field Grouping">
+ <title>Field Grouping</title>
+ <p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
+ <p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
+ <source>title:(+return +"pink panther")</source>
+ </section>
+
+ <section id="Escaping Special Characters">
+ <title>Escaping Special Characters</title>
+ <p>Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is:</p>
+ <p>+ - &amp;&amp; || ! ( ) { } [ ] ^ " ~ * ? : \</p>
+ <p>To escape these characters, use a \ before the character. For example, to search for (1+1):2 use the query:</p>
+ <source>\(1\+1\)\:2</source>
+ </section>
+
+ </body>
+</document>
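The escaping rule in the "Escaping Special Characters" section of queryparsersyntax.xml above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Lucene's own code: the class name QuerySyntaxEscaper and its escape method are hypothetical, and the sketch assumes that each individual & and | character is escaped (the page lists them only as the two-character tokens && and ||).

```java
// Hypothetical helper for illustration; not part of the Lucene API.
final class QuerySyntaxEscaper {

    // Characters that make up the special tokens listed on the syntax page.
    // Assumption: '&' and '|' are escaped one character at a time.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Returns the input with a backslash placed before every special character. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() * 2);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Reproduces the example from the syntax page.
        System.out.println(escape("(1+1):2")); // prints \(1\+1\)\:2
    }
}
```

A helper like this would only be applied to literal user text that should not be interpreted as query syntax; escaping a whole query string would also disable operators the user intended.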
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/releases.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/releases.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/releases.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/releases.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,35 @@
+<?xml version="1.0"?>
+<document>
+<header><title>Apache Lucene - Downloads and Releases</title></header>
+<properties>
+<author email="gsingers@apache.org">Grant Ingersoll</author>
+</properties>
+<body>
+
+<section id="Downloads"><title>Downloads and Releases</title>
+<p>Information on Lucene Java Downloads and Releases.</p>
+ <section id="Official"><title>Official Release</title>
+ <p>Official releases are usually created when the <a href="whoweare.html">developers</a> feel there are
+ sufficient changes, improvements and bug fixes to warrant a release.
+ Due to the voluntary nature of Lucene, no releases are scheduled in advance.</p>
+ <p>Both binary and source releases are available for
+ <a href="http://www.apache.org/dyn/closer.cgi/lucene/java/">download from the Apache Mirrors</a>.</p>
+ </section>
+ <section id="Nightly"><title>Nightly Build Download</title>
+ <p>Nightly builds are based on the trunk version of the code checked into
+ <a href="https://svn.apache.org/repos/asf/lucene/java/trunk">SVN</a></p>
+ <a href="http://people.apache.org/dist/lucene/java/nightly/">Download</a>
+ </section>
+ <section id="source"><title>Source Code</title>
+ <p>
+ The source files are now stored using Subversion (see http://subversion.tigris.org/ and http://svnbook.red-bean.com/).
+ </p><p>
+ <code>svn checkout http://svn.apache.org/repos/asf/lucene/java/trunk lucene/java/trunk</code>
+ </p>
+
+ </section>
+</section>
+
+
+</body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/resources.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/resources.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/resources.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/resources.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,21 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Resources
+ </title>
+ </header>
+ <properties>
+ <author email="cutting@apache.org">Doug Cutting</author>
+ <title>Resources - Apache Lucene</title>
+ </properties>
+ <body>
+
+ <section id="Page moved"><title>Page moved</title>
+
+ <a href="http://wiki.apache.org/jakarta-lucene/Resources">This page is now part of the Wiki</a>
+
+ </section>
+
+ </body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/scoring.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/scoring.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/scoring.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/scoring.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,291 @@
+<?xml version="1.0"?>
+
+<document>
+ <header>
+ <title>
+ Apache Lucene - Scoring
+ </title>
+ </header>
+ <properties>
+ <author email="gsingers at apache.org">Grant Ingersoll</author>
+ </properties>
+
+ <body>
+
+ <section id="Introduction"><title>Introduction</title>
+ <p>Lucene scoring is the heart of why we all love Lucene. It is blazingly fast and it hides almost all of the complexity from the user.
+ In a nutshell, it works. At least, that is, until it doesn't work, or doesn't work as one would expect it to
+ work. Then we are left digging into Lucene internals or asking for help on java-user@lucene.apache.org to figure out why a document with five of our query terms
+ scores lower than a different document with only one of the query terms. </p>
+ <p>While this document won't answer your specific scoring issues, it will, hopefully, point you to the places that can
+ help you figure out the what and why of Lucene scoring.</p>
+ <p>Lucene scoring uses a combination of the
+ <a href="http://en.wikipedia.org/wiki/Vector_Space_Model">Vector Space Model (VSM) of Information
+ Retrieval</a> and the <a href="http://en.wikipedia.org/wiki/Standard_Boolean_model">Boolean model</a>
+ to determine
+ how relevant a given Document is to a User's query. In general, the idea behind the VSM is the more
+ times a query term appears in a document relative to
+ the number of times the term appears in all the documents in the collection, the more relevant that
+ document is to the query. It uses the Boolean model to first narrow down the documents that need to
+ be scored based on the use of boolean logic in the Query specification. Lucene also adds some
+ capabilities and refinements onto this model to support boolean and fuzzy searching, but it
+ essentially remains a VSM based system at the heart.
+ For some valuable references on VSM and IR in general refer to the
+ <a href="http://wiki.apache.org/jakarta-lucene/InformationRetrieval">Lucene Wiki IR references</a>.
+ </p>
+ <p>The rest of this document will cover <a href="#Scoring">Scoring</a> basics and how to change your
+ <a href="api/org/apache/lucene/search/Similarity.html">Similarity</a>. Next it will cover ways you can
+ customize the Lucene internals in <a href="#Changing your Scoring -- Expert Level">Changing your Scoring
+ -- Expert Level</a> which gives details on implementing your own
+ <a href="api/org/apache/lucene/search/Query.html">Query</a> class and related functionality. Finally, we
+ will finish up with some reference material in the <a href="#Appendix">Appendix</a>.
+ </p>
+ </section>
+ <section id="Scoring"><title>Scoring</title>
+ <p>Scoring is very much dependent on the way documents are indexed,
+ so it is important to understand indexing (see
+ <a href="gettingstarted.html">Apache Lucene - Getting Started Guide</a>
+ and the Lucene
+ <a href="fileformats.html">file formats</a>)
+ before continuing on with this section. It is also assumed that readers know how to use the
+ <a href="api/org/apache/lucene/search/Searcher.html#explain(Query query, int doc)">Searcher.explain(Query query, int doc)</a> functionality,
+ which can go a long way toward explaining why a score is returned.
+ </p>
+ <section id="Fields and Documents"><title>Fields and Documents</title>
+ <p>In Lucene, the objects we are scoring are
+ <a href="api/org/apache/lucene/document/Document.html">Documents</a>. A Document is a collection
+ of
+ <a href="api/org/apache/lucene/document/Field.html">Fields</a>. Each Field has semantics about how
+ it is created and stored (e.g. tokenized, untokenized, raw data, compressed). It is important to
+ note that Lucene scoring works on Fields and then combines the results to return Documents. This is
+ important because two Documents with the exact same content, but one having the content in two Fields
+ and the other in one Field, will return different scores for the same query due to length normalization
+ (assuming the
+ <a href="api/org/apache/lucene/search/DefaultSimilarity.html">DefaultSimilarity</a>
+ on the Fields).
+ </p>
+ </section>
+ <section id="Score Boosting"><title>Score Boosting</title>
+ <p>Lucene allows influencing search results by "boosting" at more than one level:
+ <ul>
+ <li><b>Document level boosting</b>
+ - while indexing - by calling
+ <a href="api/org/apache/lucene/document/Document.html#setBoost(float)">document.setBoost()</a>
+ before a document is added to the index.
+ </li>
+ <li><b>Document's Field level boosting</b>
+ - while indexing - by calling
+ <a href="api/org/apache/lucene/document/Fieldable.html#setBoost(float)">field.setBoost()</a>
+ before adding a field to the document (and before adding the document to the index).
+ </li>
+ <li><b>Query level boosting</b>
+ - during search, by setting a boost on a query clause, calling
+ <a href="api/org/apache/lucene/search/Query.html#setBoost(float)">Query.setBoost()</a>.
+ </li>
+ </ul>
+ </p>
+ <p>Indexing time boosts are preprocessed for storage efficiency and written to
+ the directory (when writing the document) in a single byte (!) as follows:
+ For each field of a document, all boosts of that field
+ (i.e. all boosts under the same field name in that doc) are multiplied.
+ The result is multiplied by the boost of the document,
+ and also multiplied by a "field length norm" value
+ that represents the length of that field in that doc
+ (so shorter fields are automatically boosted up).
+ The result is encoded as a single byte
+ (with some precision loss of course) and stored in the directory.
+ The similarity object in effect at indexing time computes the length-norm of the field.
+ </p>
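+ <p>As a rough illustration of the composition described above, the following Java sketch multiplies
+ the field boosts, the document boost and a length norm. It is not Lucene code: the class and method
+ names are hypothetical, and the length norm of 1/sqrt(numTerms) is an assumption about what
+ DefaultSimilarity computes. The single-byte encoding step is left out.</p>

```java
// Illustrative sketch only: NormSketch and its methods are hypothetical names, not Lucene API.
class NormSketch {

    // Length norm, assumed here to be 1 / sqrt(number of terms in the field).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Multiplies every boost set under one field name, the document boost and the length norm.
    static float norm(float docBoost, float[] fieldBoosts, int numTerms) {
        float product = docBoost;
        for (float boost : fieldBoosts) {
            product *= boost;
        }
        return product * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Two "title" fields boosted 2.0 and 1.5, document boost 1.0, four terms in total.
        System.out.println(norm(1.0f, new float[] {2.0f, 1.5f}, 4));
    }
}
```

+ <p>Because the length norm shrinks as the number of terms grows, the same boosts applied to a
+ longer field yield a smaller value, which is the automatic boost for shorter fields mentioned above.</p>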
+ <p>This composition of the 1-byte representation of norms
+ (that is, indexing time multiplication of field boosts, doc boost and field-length-norm)
+ is nicely described in
+ <a href="api/org/apache/lucene/document/Fieldable.html#setBoost(float)">Fieldable.setBoost()</a>.
+ </p>
+ <p>Encoding and decoding of the resulting float norm in a single byte are done by the
+ static methods of the class Similarity:
+ <a href="api/org/apache/lucene/search/Similarity.html#encodeNorm(float)">encodeNorm()</a> and
+ <a href="api/org/apache/lucene/search/Similarity.html#decodeNorm(byte)">decodeNorm()</a>.
+ Due to loss of precision, it is not guaranteed that decode(encode(x)) = x,
+ e.g. decode(encode(0.89)) = 0.75.
+ At scoring (search) time, this norm is brought into the score of a document
+ as <b>norm(t, d)</b>, as shown by the formula in
+ <a href="api/org/apache/lucene/search/Similarity.html">Similarity</a>.
+ </p>
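+ <p>To make the precision loss concrete, here is a minimal Java sketch of a lossy float-to-byte
+ round trip. The byte layout used (sign, exponent and the top two mantissa bits, shifted by a fixed
+ offset) is an assumption chosen for illustration and is not guaranteed to match what
+ Similarity.encodeNorm() and Similarity.decodeNorm() do; all names are hypothetical.</p>

```java
// Illustrative sketch only: this byte layout is an assumption and may differ from Similarity's.
class ByteNormSketch {

    // Offset that places 1.0f at byte value 124 in this hypothetical layout.
    private static final int OFFSET = 384;
    // 2^21: the low-order float bits below this position are thrown away.
    private static final int DROPPED = 0x200000;

    static byte encode(float f) {
        // Keep only the sign, the exponent and the top two mantissa bits.
        int kept = Float.floatToIntBits(f) >> 21;
        // Clamp into the unsigned byte range; zero and negative values end up as 0.
        return (byte) Math.max(0, Math.min(255, kept - OFFSET));
    }

    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        // Put the kept bits back in place; the dropped bits come back as zeros.
        return Float.intBitsToFloat((Byte.toUnsignedInt(b) + OFFSET) * DROPPED);
    }

    public static void main(String[] args) {
        System.out.println(decode(encode(1.0f)));
        System.out.println(decode(encode(0.7f)));
    }
}
```

+ <p>In this layout 1.0 survives the round trip unchanged, while 0.7 comes back as 0.625, because
+ the discarded mantissa bits are replaced by zeros on decoding.</p>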
+ </section>
+ <section id="Understanding the Scoring Formula"><title>Understanding the Scoring Formula</title>
+
+ <p>
+ The scoring formula is described in the
+ <a href="api/org/apache/lucene/search/Similarity.html">Similarity</a> class. Please take the time to study this formula, as it contains much of the information about how the
+ basics of Lucene scoring work, especially for the
+ <a href="api/org/apache/lucene/search/TermQuery.html">TermQuery</a>.
+ </p>
+ </section>
+ <section id="The Big Picture"><title>The Big Picture</title>
+ <p>OK, so the tf-idf formula and the
+ <a href="api/org/apache/lucene/search/Similarity.html">Similarity</a>
+ are great for understanding the basics of Lucene scoring, but what really drives Lucene scoring is
+ the use of, and interaction between, the
+ <a href="api/org/apache/lucene/search/Query.html">Query</a> classes, as created by each application in
+ response to a user's information need.
+ </p>
+ <p>In this regard, Lucene offers a wide variety of <a href="api/org/apache/lucene/search/Query.html">Query</a> implementations, most of which are in the
+ <a href="api/org/apache/lucene/search/package-summary.html">org.apache.lucene.search</a> package.
+ These implementations can be combined in a wide variety of ways to provide complex querying
+ capabilities along with
+ information about where matches took place in the document collection. The <a href="#Query Classes">Query</a>
+ section below
+ highlights some of the more important Query classes. For information on the other ones, see the
+ <a href="api/org/apache/lucene/search/package-summary.html">package summary</a>. For details on implementing
+ your own Query class, see <a href="#Changing your Scoring -- Expert Level">Changing your Scoring --
+ Expert Level</a> below.
+ </p>
+ <p>Once a Query has been created and submitted to the
+ <a href="api/org/apache/lucene/search/IndexSearcher.html">IndexSearcher</a>, the scoring process
+ begins. (See the <a
+ href="#Appendix">Appendix</a> Algorithm section for more notes on the process.) After some infrastructure setup,
+ control finally passes to the <a href="api/org/apache/lucene/search/Weight.html">Weight</a> implementation and its
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a> instance. In the case of any type of
+ <a href="api/org/apache/lucene/search/BooleanQuery.html">BooleanQuery</a>, scoring is handled by the
+ <a href="http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/BooleanQuery.java?view=log">BooleanWeight2</a> (link goes to ViewVC BooleanQuery java code which contains the BooleanWeight2 inner class),
+ unless the static
+ <a href="api/org/apache/lucene/search/BooleanQuery.html#setUseScorer14(boolean)">
+ BooleanQuery#setUseScorer14(boolean)</a> method is called with true,
+ in which case the
+ <a href="http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/BooleanQuery.java?view=log">BooleanWeight</a>
+ (link goes to ViewVC BooleanQuery java code, which contains the BooleanWeight inner class) from the 1.4 version of Lucene is used instead.
+ See <a href="http://svn.apache.org/repos/asf/lucene/java/trunk/CHANGES.txt">CHANGES.txt</a> under release 1.9 RC1 for more information on choosing which Scorer to use.
+ </p>
+ <p>
+ Assuming the use of the BooleanWeight2, a
+ BooleanScorer2 is created by bringing together
+ all of the
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a>s from the sub-clauses of the BooleanQuery.
+ When the BooleanScorer2 is asked to score, it delegates its work to an internal Scorer based on the type
+ of clauses in the Query. This internal Scorer essentially loops over the sub scorers and sums the scores
+ provided by each scorer while factoring in the coord() score.
+ <!-- Do we want to fill in the details of the counting sum scorer, disjunction scorer, etc.? -->
+ </p>
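+ <p>The summing step described above can be sketched in Java as follows. This is a simplified
+ illustration, not the BooleanScorer2 code: the class name is hypothetical, unmatched clauses are
+ marked with Float.NaN purely for convenience, and coord() is assumed to be the fraction of
+ clauses that matched.</p>

```java
// Illustrative sketch only: CoordSumSketch is a hypothetical name, not the BooleanScorer2 code.
class CoordSumSketch {

    // coord(): fraction of the query clauses that matched, assumed to be overlap / maxOverlap.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // clauseScores holds one entry per clause for a single document;
    // Float.NaN marks a clause that did not match that document.
    static float score(float[] clauseScores) {
        float sum = 0.0f;
        int matched = 0;
        for (float s : clauseScores) {
            if (!Float.isNaN(s)) {
                sum += s;
                matched++;
            }
        }
        return sum * coord(matched, clauseScores.length);
    }

    public static void main(String[] args) {
        // Four clauses, two of which matched with scores 0.5 and 1.5.
        System.out.println(score(new float[] {0.5f, Float.NaN, 1.5f, Float.NaN}));
    }
}
```

+ <p>A document matching every clause keeps its full summed score, while one matching only half of
+ the clauses has its sum halved.</p>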
+ </section>
+ <section id="Query Classes"><title>Query Classes</title>
+ <p>For information on the Query Classes, refer to the
+ <a href="api/org/apache/lucene/search/package-summary.html#query">search package javadocs</a>.
+ </p>
+ </section>
+ <section id="Changing Similarity"><title>Changing Similarity</title>
+ <p>One of the ways of changing the scoring characteristics of Lucene is to change the similarity factors. For information on
+ how to do this, see the
+ <a href="api/org/apache/lucene/search/package-summary.html#changingSimilarity">search package javadocs</a>.</p>
+ </section>
+
+ </section>
+ <section id="Changing your Scoring -- Expert Level"><title>Changing your Scoring -- Expert Level</title>
+ <p>At a much deeper level, one can affect scoring by implementing their own Query classes (and related scoring classes). To learn more
+ about how to do this, refer to the
+ <a href="api/org/apache/lucene/search/package-summary.html#scoring">search package javadocs</a>.
+ </p>
+ </section>
+
+ <section id="Appendix"><title>Appendix</title>
+ <section id="Class Diagrams"><title>Class Diagrams</title>
+ <p>
+ <a href="http://wiki.apache.org/jakarta-lucene/KarlWettin?action=AttachFile&do=view&target=search_uml_1.jpg">
+ Karl Wettin's UML on the Wiki</a>
+ </p>
+ </section>
+ <section id="Sequence Diagrams"><title>Sequence Diagrams</title>
+ <p >FILL IN HERE. Volunteers?</p>
+ </section>
+ <section id="Algorithm"><title>Algorithm</title>
+ <p>This section is mostly notes on stepping through the Scoring process and serves as
+ fertilizer for the earlier sections.</p>
+ <p>In the typical search application, a
+ <a href="api/org/apache/lucene/search/Query.html">Query</a>
+ is passed to the
+ <a
+ href="api/org/apache/lucene/search/Searcher.html">Searcher</a>, beginning the scoring process.
+ </p>
+ <p>Once inside the Searcher, a
+ <a href="api/org/apache/lucene/search/Hits.html">Hits</a>
+ object is constructed, which handles the scoring and caching of the search results.
+ The Hits constructor stores references to three or four important objects:
+ <ol>
+ <li>The
+ <a href="api/org/apache/lucene/search/Weight.html">Weight</a>
+ object of the Query. The Weight object is an internal representation of the Query that
+ allows the Query to be reused by the Searcher.
+ </li>
+ <li>The Searcher that initiated the call.</li>
+ <li>A
+ <a href="api/org/apache/lucene/search/Filter.html">Filter</a>
+ for limiting the result set. Note, the Filter may be null.
+ </li>
+ <li>A
+ <a href="api/org/apache/lucene/search/Sort.html">Sort</a>
+ object for specifying how to sort the results if the standard score based sort method is not
+ desired.
+ </li>
+ </ol>
+ </p>
+ <p>Now that the Hits object has been initialized, it begins the process of identifying documents that
+ match the query by calling the getMoreDocs method. Assuming we are not sorting (since sorting doesn't
+ affect the raw Lucene score),
+ we call on the "expert" search method of the Searcher, passing in our
+ <a href="api/org/apache/lucene/search/Weight.html">Weight</a>
+ object,
+ <a href="api/org/apache/lucene/search/Filter.html">Filter</a>
+ and the number of results we want. This method
+ returns a
+ <a href="api/org/apache/lucene/search/TopDocs.html">TopDocs</a>
+ object, which is an internal collection of search results.
+ The Searcher creates a
+ <a href="api/org/apache/lucene/search/TopDocCollector.html">TopDocCollector</a>
+ and passes it along with the Weight and Filter to another expert search method (for more on the
+ <a href="api/org/apache/lucene/search/HitCollector.html">HitCollector</a>
+ mechanism, see
+ <a href="api/org/apache/lucene/search/Searcher.html">Searcher</a>).
+ The TopDocCollector uses a
+ <a href="api/org/apache/lucene/util/PriorityQueue.html">PriorityQueue</a>
+ to collect the top results for the search.
+ </p>
+ <p>If a Filter is being used, some initial setup is done to determine which docs to include. Otherwise,
+ we ask the Weight for
+ a
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a>
+ for the
+ <a href="api/org/apache/lucene/index/IndexReader.html">IndexReader</a>
+ of the current searcher and we proceed by
+ calling the score method on the
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a>.
+ </p>
+ <p>At last, we are actually going to score some documents. The score method takes in the HitCollector
+ (most likely the TopDocCollector) and does its business.
+ Of course, here is where things get involved. The
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a>
+ that is returned by the
+ <a href="api/org/apache/lucene/search/Weight.html">Weight</a>
+ object depends on what type of Query was submitted. In most real world applications with multiple
+ query terms,
+ the
+ <a href="api/org/apache/lucene/search/Scorer.html">Scorer</a>
+ is going to be a
+ <a href="http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/BooleanScorer2.java?view=log">BooleanScorer2</a>
+ (see the section on customizing your scoring for info on changing this).
+
+ </p>
+ <p>Assuming a BooleanScorer2 scorer, we first initialize the Coordinator, which is used to apply the
+ coord() factor. We then
+ get an internal Scorer based on the required, optional and prohibited parts of the query.
+ Using this internal Scorer, the BooleanScorer2 then proceeds
+ into a while loop based on the Scorer#next() method. The next() method advances to the next document
+ matching the query. This is an
+ abstract method in the Scorer class and is thus overridden by all derived
+ implementations. <!-- DOUBLE CHECK THIS -->If you have a simple OR query,
+ your internal Scorer is most likely a DisjunctionSumScorer, which essentially combines the scores
+ from the sub scorers of the OR'd terms.</p>
+ </section>
+ </section>
+ </body>
+</document>
\ No newline at end of file
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/site.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/site.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/site.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/site.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,112 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation or its licensors,
+ as applicable.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!--
+Forrest site.xml
+
+This file contains an outline of the site's information content. It is used to:
+- Generate the website menus (though these can be overridden - see docs)
+- Provide semantic, location-independent aliases for internal 'site:' URIs, eg
+<link href="site:changes"> links to changes.html (or ../changes.html if in
+ subdir).
+- Provide aliases for external URLs in the external-refs section. Eg, <link
+ href="ext:cocoon"> links to http://cocoon.apache.org/
+
+See http://forrest.apache.org/docs/linking.html for more info
+-->
+
+<site label="Lucene" href="" xmlns="http://apache.org/forrest/linkmap/1.0" tab="">
+ <!-- Note: No matter what you configure here, Forrest will always try to load
+ index.html when you request http://yourHost/.
+ 'How can I use a start-up-page other than index.html?' in the FAQs
+ tells you how to change that.
+ -->
+ <about label="About">
+ <overview label="Overview" href="index.html" description="Welcome to Java Lucene"/>
+ <features label="Features" href="features.html"/>
+ <powered-by label="Powered by Lucene" href="ext:powered-by"/>
+ <who-we-are label="Who We Are" href="whoweare.html"/>
+ </about>
+ <!-- keep submenu items in alpha order -->
+ <docs label="Documentation">
+
+ <apidocs label="API Docs" href="api/"/>
+ <benchmarks label="Benchmarks" href="benchmarks.html"/>
+ <contributions label="Contributions" href="contributions.html"/>
+ <faq label="FAQ" href="ext:faq" />
+ <file-formats label="File Formats" href="fileformats.html"/>
+ <tutorial label="Getting Started" href="gettingstarted.html"/>
+ <lucene-sandbox label="Lucene Sandbox" href="lucene-sandbox/index.html"/>
+ <query-syntax label="Query Syntax" href="queryparsersyntax.html"/>
+ <scoring label="Scoring" href="scoring.html"/>
+ <wiki label="Wiki" href="ext:wiki" />
+ </docs>
+
+ <resources label="Resources">
+ <issues label="Issue Tracking" href="ext:issues"/>
+ <contact label="Mailing Lists" href="mailinglists.html"/>
+ <release label="Downloads" href="releases.html"/>
+ <svn label="Version Control" href="ext:source" />
+ </resources>
+ <versions label="Site Versions">
+ <official label="Official" href="./"/>
+<!-- Needs to be filled in -->
+<!-- <nightly label="Nightly" href=""/> -->
+
+ </versions>
+ <projects label="Related Projects">
+ <lucene label="Lucene (Top-Level)" href="ext:topLevel"/>
+ <lucene label="Hadoop" href="ext:hadoop"/>
+ <lucene label="Lucy" href="ext:lucy"/>
+ <lucene label="Lucene.NET" href="ext:lucene-net"/>
+ <lucene label="Nutch" href="ext:nutch" />
+ <lucene label="SOLR" href="ext:solr"/>
+ </projects>
+
+ <!--
+ The href must be wholesite.html/pdf. You can change the labels and node names.
+ <all label="All">
+ <whole_site_html label="Whole Site HTML" href="wholesite.html"/>
+ <whole_site_pdf label="Whole Site PDF" href="wholesite.pdf"/>
+ </all>
+ -->
+
+ <external-refs>
+ <forrest href="http://forrest.apache.org/">
+ <linking href="docs/linking.html"/>
+ <validation href="docs/validation.html"/>
+ <webapp href="docs/your-project.html#webapp"/>
+ <dtd-docs href="docs/dtd-docs.html"/>
+ </forrest>
+ <cocoon href="http://cocoon.apache.org/"/>
+ <xml.apache.org href="http://xml.apache.org/"/>
+ <issues href="http://issues.apache.org/jira/browse/LUCENE"/>
+ <topLevel href="http://lucene.apache.org"/>
+ <solr href="http://incubator.apache.org/solr/" />
+ <nutch href="http://lucene.apache.org/nutch/" />
+ <lucy href="http://lucene.apache.org/lucy/"/>
+ <lucene-net href="http://incubator.apache.org/projects/lucene.net.html"/>
+ <hadoop href="http://lucene.apache.org/hadoop/"/>
+ <wiki href="http://wiki.apache.org/jakarta-lucene" />
+ <faq href="http://wiki.apache.org/jakarta-lucene/LuceneFAQ" />
+ <releases href="http://www.apache.org/dyn/closer.cgi/lucene/java/" />
+ <source href="http://svn.apache.org/viewcvs.cgi/lucene/java/"/>
+ <powered-by href="http://wiki.apache.org/jakarta-lucene/PoweredBy"/>
+ </external-refs>
+
+</site>
Propchange: lucene/java/trunk/src/site/src/documentation/content/xdocs/site.xml
------------------------------------------------------------------------------
svn:executable = *
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/systemproperties.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/systemproperties.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/systemproperties.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/systemproperties.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,148 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - System Properties
+ </title>
+ </header>
+ <properties>
+ <author email="otis @ apache dot org">Otis Gospodnetić</author>
+ </properties>
+ <body>
+
+ <section id="About this Document"><title>About this Document</title>
+ <p>
+ Lucene has a number of properties that can be tuned. They can be adjusted either
+ programmatically, using the Lucene API, or their default values can be set via
+ system properties described in this document. Starting
+ with Lucene 1.9, the system properties (except org.apache.lucene.lockDir) are no longer
+ supported and the API (i.e. the get/set methods) should be used directly.
+ </p>
+ </section>
+
+ <section id="System Properties"><title>System Properties</title>
+ <p>
+ <table width="100%" border="0" cellpadding="4" cellspacing="0">
+ <tr valign="top">
+ <td width="25%"><b>Lucene Property</b></td>
+ <td width="25%"><b>System Property</b></td>
+ <td width="25%"><b>Default Value</b></td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#mergeFactor">mergeFactor</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.mergeFactor
+ </td>
+ <td width="25%">
+ 10
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#minMergeDocs">minMergeDocs</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.minMergeDocs
+ </td>
+ <td width="25%">
+ 10
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#maxMergeDocs">maxMergeDocs</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.maxMergeDocs
+ </td>
+ <td width="25%">
+ Integer.MAX_VALUE
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#maxFieldLength">maxFieldLength</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.maxFieldLength
+ </td>
+ <td width="25%">
+ 10000
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#COMMIT_LOCK_TIMEOUT">COMMIT_LOCK_TIMEOUT</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.commitLockTimeout
+ </td>
+ <td width="25%">
+ 10000 ms
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/IndexWriter.html#WRITE_LOCK_TIMEOUT">WRITE_LOCK_TIMEOUT</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.writeLockTimeout
+ </td>
+ <td width="25%">
+ 1000 ms
+ </td>
+ </tr>
+
+
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/search/BooleanQuery.html#maxClauseCount">maxClauseCount</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.maxClauseCount
+ </td>
+ <td width="25%">
+ 1024
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/store/FSDirectory.html#lockDir">lockDir</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.lockDir
+ </td>
+ <td width="25%">
+ the value of <code>java.io.tmpdir</code> system property
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/store/FSDirectory.html#FSDirectory.class">FSDirectory.class</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.FSDirectory.class
+ </td>
+ <td width="25%">
+ org.apache.lucene.store.FSDirectory
+ </td>
+ </tr>
+ <tr valign="TOP">
+ <td width="25%">
+ <a href="api/org/apache/lucene/index/SegmentReader.html#SegmentReader.class">SegmentReader.class</a>
+ </td>
+ <td width="25%">
+ org.apache.lucene.index.SegmentReader.class
+ </td>
+ <td width="25%">
+ org.apache.lucene.index.SegmentReader
+ </td>
+ </tr>
+ </table>
+ </p>
+ </section>
+
+ </body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/tabs.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/tabs.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/tabs.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/tabs.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,59 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Copyright 2002-2004 The Apache Software Foundation or its licensors,
+ as applicable.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE tabs PUBLIC "-//APACHE//DTD Cocoon Documentation Tab V1.1//EN" "http://forrest.apache.org/dtd/tab-cocoon-v11.dtd">
+
+<tabs software="Java"
+ title="Java"
+ copyright="The Apache Software Foundation"
+ xmlns:xlink="http://www.w3.org/1999/xlink">
+
+ <!-- The rules for tabs are:
+ @dir will always have '/@indexfile' added.
+ @indexfile gets appended to @dir if the tab is selected. Defaults to 'index.html'
+ @href is not modified unless it is root-relative and obviously specifies a
+ directory (ends in '/'), in which case /index.html will be added
+ If @id's are present, site.xml entries with a matching @tab will be in that tab.
+
+ Tabs can be embedded to a depth of two. The second level of tabs will only
+ be displayed when their parent tab is selected.
+ -->
+
+ <!--
+ <tab id="" label="Home" dir="" indexfile="index.html"/>
+ -->
+
+ <tab id="" label="Main" dir=""/>
+ <tab id="wiki" label="Wiki" href="http://wiki.apache.org/jakarta-lucene"/>
+
+ <!--
+ <tab id="samples" label="Samples" dir="samples" indexfile="sample.html">
+ <tab id="samples-index" label="Index" dir="samples" indexfile="index.html"/>
+ <tab id="samples-sample2" label="Sample2" dir="samples" indexfile="static.html"/>
+ </tab>
+ <tab label="Apache XML Projects" href="http://xml.apache.org">
+ <tab label="Forrest" href="http://forrest.apache.org"/>
+ <tab label="Xerces" href="http://xml.apache.org/xerces"/>
+ </tab>
+ <tab id="plugins" label="Plugins" dir="pluginDocs/plugins_0_70" indexfile="index.html"/>
+ -->
+ <!-- Add new tabs here, eg:
+ <tab label="How-Tos" dir="community/howto/"/>
+ <tab label="XML Site" dir="xml-site/"/>
+ -->
+
+</tabs>
Propchange: lucene/java/trunk/src/site/src/documentation/content/xdocs/tabs.xml
------------------------------------------------------------------------------
svn:executable = *
Added: lucene/java/trunk/src/site/src/documentation/content/xdocs/whoweare.xml
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/whoweare.xml?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/content/xdocs/whoweare.xml (added)
+++ lucene/java/trunk/src/site/src/documentation/content/xdocs/whoweare.xml Sun Nov 26 16:00:46 2006
@@ -0,0 +1,64 @@
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>
+ Apache Lucene - Who We Are
+ </title>
+ </header>
+<properties>
+<author email="husted@apache.org">Ted Husted</author>
+<author email="cutting@apache.org">Doug Cutting</author>
+</properties>
+<body>
+
+<section id="Who We Are"><title>Who We Are</title>
+<p>Lucene is maintained by a team of volunteer developers.</p>
+</section>
+
+<section id="Committers"><title>Committers</title>
+<ul>
+<li><b><a href="http://www.nutch.org/blog/cutting.html">Doug Cutting</a></b> (cutting@...)
+
+<p>Lucene was originally written in Doug's spare time during late 1997
+and early 1998. Doug had previously written search engines at Xerox's
+Palo Alto Research Center (PARC), Apple, and Excite@Home, and authored
+several information retrieval <a
+href="http://lucene.sourceforge.net/publications.html">papers and
+patents</a>.</p>
+
+</li>
+<li><b><a href="http://www.jroller.com/page/otis">Otis Gospodnetic</a></b> (otis@...)</li>
+<li><b>Brian Goetz</b> (briangoetz@...)</li>
+<li><b>Scott Ganyo</b> (scottganyo@...)</li>
+<li><b>Eugene Gluzberg</b> (drag0n@...)</li>
+<li><b>Matt Tucker</b> (mtucker@...)</li>
+<li><b>Cory Hubert</b> (clhubert@...)</li>
+<li><b>Dave Kor</b> (davekor@...)</li>
+<li><b>Jon Stevens</b> (jon at latchkey.com)</li>
+<li><b>Tal Dayan</b> (zapta@...)</li>
+<li><b>Andrew C. Oliver</b> (acoliver@...)</li>
+<li><b>Peter Carlson</b> (carlson@...)</li>
+<li><b>Erik Hatcher</b> (ehatcher@...)</li>
+<li><b>Dmitry Serebrennikov</b> (dmitrys@...)</li>
+<li><b>Christoph Goller</b> (goller@...)</li>
+<li><b>Tim Jones</b> (tjones@...)</li>
+<li><b>Daniel Naber</b> (dnaber@...)</li>
+<li><b>Bernhard Messer</b> (bmesser@...)</li>
+<li><b>Yonik Seeley</b> (yonik@...)</li>
+<li><b>Grant Ingersoll</b> (gsingers@...) </li>
+<li><b>Mike McCandless</b> (mikemccand@...) </li>
+</ul>
+
+<p>Note that the email addresses above end with @apache.org unless a full address is given.</p>
+
+</section>
+
+<section id="Other Contributors"><title>Other Contributors</title>
+<ul>
+<li>Josh Bloch</li>
+<li>Ted Husted</li>
+</ul>
+</section>
+
+</body>
+</document>
Added: lucene/java/trunk/src/site/src/documentation/sitemap.xmap
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/sitemap.xmap?view=auto&rev=479465
==============================================================================
--- lucene/java/trunk/src/site/src/documentation/sitemap.xmap (added)
+++ lucene/java/trunk/src/site/src/documentation/sitemap.xmap Sun Nov 26 16:00:46 2006
@@ -0,0 +1,72 @@
+<?xml version="1.0"?>
+<!--
+ Copyright 2002-2005 The Apache Software Foundation or its licensors,
+ as applicable.
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
+
+ <map:components>
+ <map:actions>
+ <map:action logger="sitemap.action.sourcetype" name="sourcetype" src="org.apache.forrest.sourcetype.SourceTypeAction">
+ <sourcetype name="hello-v1.0">
+ <document-declaration public-id="-//Acme//DTD Hello Document V1.0//EN" />
+ </sourcetype>
+ </map:action>
+ </map:actions>
+
+ <map:selectors default="parameter">
+ <map:selector logger="sitemap.selector.parameter" name="parameter" src="org.apache.cocoon.selection.ParameterSelector" />
+ </map:selectors>
+ </map:components>
+
+ <map:resources>
+ <map:resource name="transform-to-document">
+ <map:act type="sourcetype" src="{src}">
+ <map:select type="parameter">
+ <map:parameter name="parameter-selector-test" value="{sourcetype}" />
+
+ <map:when test="hello-v1.0">
+ <map:generate src="{project:content.xdocs}{../../1}.xml" />
+ <map:transform src="{project:resources.stylesheets}/hello2document.xsl" />
+ <map:serialize type="xml-document"/>
+ </map:when>
+ </map:select>
+ </map:act>
+ </map:resource>
+ </map:resources>
+
+ <map:pipelines>
+ <map:pipeline>
+ <map:match pattern="old_site/*.html">
+ <map:select type="exists">
+ <map:when test="{project:content}{1}.html">
+ <map:read src="{project:content}{1}.html" mime-type="text/html"/>
+ <!--
+ Use this instead if you want JTidy to clean up your HTML
+ <map:generate type="html" src="{project:content}/{0}" />
+ <map:serialize type="html"/>
+ -->
+ </map:when>
+ </map:select>
+ </map:match>
+
+ <map:match pattern="**.xml">
+ <map:call resource="transform-to-document">
+ <map:parameter name="src" value="{project:content.xdocs}{1}.xml" />
+ </map:call>
+ </map:match>
+ </map:pipeline>
+ </map:pipelines>
+</map:sitemap>
Propchange: lucene/java/trunk/src/site/src/documentation/sitemap.xmap
------------------------------------------------------------------------------
svn:executable = *