Posted to commits@lucy.apache.org by bu...@apache.org on 2011/08/24 02:26:09 UTC
[lucy-commits] svn commit: r794766 [2/4] - in /websites/staging/lucy/trunk/content/lucy/docs/perl: ./ Lucy/ Lucy/Analysis/ Lucy/Docs/ Lucy/Docs/Cookbook/ Lucy/Docs/Tutorial/ Lucy/Document/ Lucy/Highlight/ Lucy/Index/ Lucy/Object/ Lucy/Plan/ Lucy/Search/ Lucy/Search/C...

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Analysis.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Analysis.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Analysis.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,72 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Analysis - How to choose and use Analyzers.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Try swapping out the PolyAnalyzer in our Schema for a RegexTokenizer:</p>
+
+<pre><code>    my $tokenizer = Lucy::Analysis::RegexTokenizer-&gt;new;
+    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $tokenizer,
+    );</code></pre>
+
+<p>Search for <code>senate</code>, <code>Senate</code>, and <code>Senator</code> before and after making the change and re-indexing.</p>
+
+<p>Under PolyAnalyzer, the results are identical for all three searches, but under RegexTokenizer, searches are case-sensitive, and the result sets for <code>Senate</code> and <code>Senator</code> are distinct.</p>
+
+<h2 id="PolyAnalyzer">PolyAnalyzer</h2>
+
+<p>What&#39;s happening is that PolyAnalyzer is performing more aggressive processing than RegexTokenizer. In addition to tokenizing, it&#39;s also converting all text to lower case so that searches are case-insensitive, and using a &quot;stemming&quot; algorithm to reduce related words to a common stem (<code>senat</code>, in this case).</p>
+
+<p>PolyAnalyzer is actually multiple Analyzers wrapped up in a single package. In this case, it&#39;s three-in-one, since specifying a PolyAnalyzer with <code>language =&gt; &#39;en&#39;</code> is equivalent to this snippet:</p>
+
+<pre><code>    my $case_folder  = Lucy::Analysis::CaseFolder-&gt;new;
+    my $tokenizer    = Lucy::Analysis::RegexTokenizer-&gt;new;
+    my $stemmer      = Lucy::Analysis::SnowballStemmer-&gt;new( language =&gt; &#39;en&#39; );
+    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $case_folder, $tokenizer, $stemmer ], 
+    );</code></pre>
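+
+<p>To see what each stage contributes, the chain can be imitated in plain Perl. This is a minimal sketch under loud assumptions: <code>case_fold</code>, <code>tokenize</code>, <code>crude_stem</code>, and <code>analyze</code> are hypothetical helpers, the token pattern is a bare <code>\w+</code>, and the suffix stripper is a toy that only happens to agree with the <code>senat</code> example. None of it is Lucy code.</p>
+
```perl
use strict;
use warnings;

# Toy stand-ins for the three stages; NOT Lucy's implementations.
sub case_fold { return lc $_[0] }
sub tokenize  { return $_[0] =~ /(\w+)/g }

# Hypothetical suffix stripper; the real SnowballStemmer is far more careful.
sub crude_stem {
    my $token = shift;
    $token =~ s/(?:ors|or|es|e|s)\z//;
    return $token;
}

# Run the stages in the same order as the PolyAnalyzer snippet above.
sub analyze {
    my $text = shift;
    return [ map { crude_stem($_) } tokenize( case_fold($text) ) ];
}

for my $word (qw( senate Senate Senator )) {
    print "$word => @{ analyze($word) }\n";
}
```
+
+<p>All three inputs come out as <code>senat</code>, which is why the three searches return identical results under PolyAnalyzer.</p>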
+
+<p>You can add or subtract Analyzers from there if you like. Try adding a fourth Analyzer, a SnowballStopFilter for suppressing &quot;stopwords&quot; like &quot;the&quot;, &quot;if&quot;, and &quot;maybe&quot;.</p>
+
+<pre><code>    my $stopfilter = Lucy::Analysis::SnowballStopFilter-&gt;new( 
+        language =&gt; &#39;en&#39;,
+    );
+    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $case_folder, $tokenizer, $stopfilter, $stemmer ], 
+    );</code></pre>
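+
+<p>In the snippet above the stop filter sits between the tokenizer and the stemmer, so it sees lower-cased, unstemmed tokens. The following sketch imitates that step in plain Perl. The three-word stoplist and the <code>remove_stopwords</code> helper are assumptions made for illustration; the stoplist that SnowballStopFilter actually uses is not reproduced here.</p>
+
```perl
use strict;
use warnings;

# Hypothetical three-word stoplist taken from the examples in the text.
my %is_stopword = map { $_ => 1 } qw( the if maybe );

# Drop stopwords from an already case-folded token list.
sub remove_stopwords {
    my @tokens = @_;
    return grep { !$is_stopword{$_} } @tokens;
}

my @kept = remove_stopwords(qw( the senate may convene if needed ));
print "@kept\n";
```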
+
+<p>Also, try removing the SnowballStemmer.</p>
+
+<pre><code>    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $case_folder, $tokenizer ], 
+    );</code></pre>
+
+<p>The original choice of a stock English PolyAnalyzer probably still yields the best results for this document collection, but you get the idea: sometimes you want a different Analyzer.</p>
+
+<h2 id="When-the-best-Analyzer-is-no-Analyzer">When the best Analyzer is no Analyzer</h2>
+
+<p>Sometimes you don&#39;t want an Analyzer at all. That was true for our &quot;url&quot; field because we didn&#39;t need it to be searchable, but it&#39;s also true for certain types of searchable fields. For instance, &quot;category&quot; fields are often set up to match exactly or not at all, as are fields like &quot;last_name&quot; (because you may not want to conflate results for &quot;Humphrey&quot; and &quot;Humphries&quot;).</p>
+
+<p>To specify that there should be no analysis performed at all, use StringType:</p>
+
+<pre><code>    my $type = Lucy::Plan::StringType-&gt;new;
+    $schema-&gt;spec_field( name =&gt; &#39;category&#39;, type =&gt; $type );</code></pre>
+
+<h2 id="Highlighting-up-next">Highlighting up next</h2>
+
+<p>In our next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/Highlighter.html">Lucy::Docs::Tutorial::Highlighter</a>, we&#39;ll add highlighted excerpts from the &quot;content&quot; field to our search results.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/BeyondSimple.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/BeyondSimple.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/BeyondSimple.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,124 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::BeyondSimple - A more flexible app structure.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<h2 id="Goal">Goal</h2>
+
+<p>In this tutorial chapter, we&#39;ll refactor the apps we built in <a href="../../../Lucy/Docs/Tutorial/Simple.html">Lucy::Docs::Tutorial::Simple</a> so that they look exactly the same from the end user&#39;s point of view, but offer the developer greater possibilities for expansion.</p>
+
+<p>To achieve this, we&#39;ll ditch Lucy::Simple and replace it with the classes that it uses internally:</p>
+
+<ul>
+
+<li><p><a href="../../../Lucy/Plan/Schema.html">Lucy::Plan::Schema</a> - Plan out your index.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Plan/FullTextType.html">Lucy::Plan::FullTextType</a> - Field type for full text search.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Analysis/PolyAnalyzer.html">Lucy::Analysis::PolyAnalyzer</a> - A one-size-fits-all parser/tokenizer.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Index/Indexer.html">Lucy::Index::Indexer</a> - Manipulate index content.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/IndexSearcher.html">Lucy::Search::IndexSearcher</a> - Search an index.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/Hits.html">Lucy::Search::Hits</a> - Iterate over hits returned by a Searcher.</p>
+
+</li>
+</ul>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p>After we load our modules...</p>
+
+<pre><code>    use Lucy::Plan::Schema;
+    use Lucy::Plan::FullTextType;
+    use Lucy::Analysis::PolyAnalyzer;
+    use Lucy::Index::Indexer;</code></pre>
+
+<p>... the first item we&#39;re going to need is a <a href="../../../Lucy/Plan/Schema.html">Schema</a>.</p>
+
+<p>The primary job of a Schema is to specify what fields are available and how they&#39;re defined. We&#39;ll start off with three fields: title, content and url.</p>
+
+<pre><code>    # Create Schema.
+    my $schema = Lucy::Plan::Schema-&gt;new;
+    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        language =&gt; &#39;en&#39;,
+    );
+    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $polyanalyzer,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;title&#39;,   type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;,     type =&gt; $type );</code></pre>
+
+<p>All of the fields are spec&#39;d out using the &quot;FullTextType&quot; FieldType, indicating that they will be searchable as &quot;full text&quot; -- which means that they can be searched for individual words. The &quot;analyzer&quot;, which is unique to FullTextType fields, is what breaks up the text into searchable tokens.</p>
+
+<p>Next, we&#39;ll swap our Lucy::Simple object out for a Lucy::Index::Indexer. The substitution will be straightforward because Simple has merely been serving as a thin wrapper around an inner Indexer, and we&#39;ll just be peeling away the wrapper.</p>
+
+<p>First, replace the constructor:</p>
+
+<pre><code>    # Create Indexer.
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index    =&gt; $path_to_index,
+        schema   =&gt; $schema,
+        create   =&gt; 1,
+        truncate =&gt; 1,
+    );</code></pre>
+
+<p>Next, have the <code>$indexer</code> object <code>add_doc</code> where we were having the <code>$lucy</code> object <code>add_doc</code> before:</p>
+
+<pre><code>    foreach my $filename (@filenames) {
+        my $doc = parse_file($filename);
+        $indexer-&gt;add_doc($doc);
+    }</code></pre>
+
+<p>There&#39;s only one extra step required: at the end of the app, you must call commit() explicitly to close the indexing session and commit your changes. (Lucy::Simple hides this detail, calling commit() implicitly when it needs to.)</p>
+
+<pre><code>    $indexer-&gt;commit;</code></pre>
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>In our search app as in our indexing app, Lucy::Simple has served as a thin wrapper -- this time around <a href="../../../Lucy/Search/IndexSearcher.html">Lucy::Search::IndexSearcher</a> and <a href="../../../Lucy/Search/Hits.html">Lucy::Search::Hits</a>. Swapping out Simple for these two classes is also straightforward:</p>
+
+<pre><code>    use Lucy::Search::IndexSearcher;
+    
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( 
+        index =&gt; $path_to_index,
+    );
+    my $hits = $searcher-&gt;hits(    # returns a Hits object, not a hit count
+        query      =&gt; $q,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );
+    my $hit_count = $hits-&gt;total_hits;  # get the hit count here
+    
+    ...
+    
+    while ( my $hit = $hits-&gt;next ) {
+        ...
+    }</code></pre>
+
+<h2 id="Hooray-">Hooray!</h2>
+
+<p>Congratulations! Your apps do the same thing as before... but now they&#39;ll be easier to customize.</p>
+
+<p>In our next chapter, <a href="../../../Lucy/Docs/Tutorial/FieldType.html">Lucy::Docs::Tutorial::FieldType</a>, we&#39;ll explore how to assign different behaviors to different fields.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/FieldType.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/FieldType.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/FieldType.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,57 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::FieldType - Specify per-field properties and behaviors.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>The Schema we used in the last chapter specifies three fields:</p>
+
+<pre><code>    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $polyanalyzer,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;title&#39;,   type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;,     type =&gt; $type );</code></pre>
+
+<p>Since they are all defined as &quot;full text&quot; fields, they are all searchable -- including the <code>url</code> field, a dubious choice. Some URLs contain meaningful information, but these don&#39;t, really:</p>
+
+<pre><code>    http://example.com/us_constitution/amend1.txt</code></pre>
+
+<p>We may as well not bother indexing the URL content. To achieve that we need to assign the <code>url</code> field to a different FieldType.</p>
+
+<h2 id="StringType">StringType</h2>
+
+<p>Instead of FullTextType, we&#39;ll use a <a href="../../../Lucy/Plan/StringType.html">StringType</a>, which doesn&#39;t use an Analyzer to break up text into individual tokens. Furthermore, we&#39;ll mark this StringType as unindexed, so that its content won&#39;t be searchable at all.</p>
+
+<pre><code>    my $url_type = Lucy::Plan::StringType-&gt;new( indexed =&gt; 0 );
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;, type =&gt; $url_type );</code></pre>
+
+<p>To observe the change in behavior, try searching for <code>us_constitution</code> both before and after changing the Schema and re-indexing.</p>
+
+<h2 id="Toggling-stored">Toggling &#39;stored&#39;</h2>
+
+<p>For a taste of other FieldType possibilities, try turning off <code>stored</code> for one or more fields.</p>
+
+<pre><code>    my $content_type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $polyanalyzer,
+        stored   =&gt; 0,
+    );</code></pre>
+
+<p>Turning off <code>stored</code> for either <code>title</code> or <code>url</code> mangles our results page, but since we&#39;re not displaying <code>content</code>, turning it off for <code>content</code> has no effect -- except on index size.</p>
+
+<h2 id="Analyzers-up-next">Analyzers up next</h2>
+
+<p>Analyzers play a crucial role in the behavior of FullTextType fields. In our next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/Analysis.html">Lucy::Docs::Tutorial::Analysis</a>, we&#39;ll see how changing up the Analyzer changes search results.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Highlighter.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Highlighter.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Highlighter.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,63 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Highlighter - Augment search results with highlighted excerpts.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Adding relevant excerpts with highlighted search terms to your search results display makes it much easier for end users to scan the page and assess which hits look promising, dramatically improving their search experience.</p>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p><a href="../../../Lucy/Highlight/Highlighter.html">Lucy::Highlight::Highlighter</a> uses information generated at index time. To save resources, highlighting is disabled by default and must be turned on for individual fields.</p>
+
+<pre><code>    my $highlightable = Lucy::Plan::FullTextType-&gt;new(
+        analyzer      =&gt; $polyanalyzer,
+        highlightable =&gt; 1,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; $highlightable );</code></pre>
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>To add highlighting and excerpting to the search.cgi sample app, create a <code>$highlighter</code> object outside the hits iterating loop...</p>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher =&gt; $searcher,
+        query    =&gt; $q,
+        field    =&gt; &#39;content&#39;
+    );</code></pre>
+
+<p>... then modify the loop and the per-hit display to generate and include the excerpt.</p>
+
+<pre><code>    # Create result list.
+    my $report = &#39;&#39;;
+    while ( my $hit = $hits-&gt;next ) {
+        my $score   = sprintf( &quot;%0.3f&quot;, $hit-&gt;get_score );
+        my $excerpt = $highlighter-&gt;create_excerpt($hit);
+        $report .= qq|
+            &lt;p&gt;
+              &lt;a href=&quot;$hit-&gt;{url}&quot;&gt;&lt;strong&gt;$hit-&gt;{title}&lt;/strong&gt;&lt;/a&gt;
+              &lt;em&gt;$score&lt;/em&gt;
+              &lt;br /&gt;
+              $excerpt
+              &lt;br /&gt;
+              &lt;span class=&quot;excerptURL&quot;&gt;$hit-&gt;{url}&lt;/span&gt;
+            &lt;/p&gt;
+        |;
+    }</code></pre>
+
+<h2 id="Next-chapter:-Query-objects">Next chapter: Query objects</h2>
+
+<p>Our next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/QueryObjects.html">Lucy::Docs::Tutorial::QueryObjects</a>, illustrates how to build an &quot;advanced search&quot; interface using <a href="../../../Lucy/Search/Query.html">Query</a> objects instead of query strings.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/QueryObjects.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/QueryObjects.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/QueryObjects.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,130 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::QueryObjects - Use Query objects instead of query strings.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Until now, our search app has had only a single search box. In this tutorial chapter, we&#39;ll move towards an &quot;advanced search&quot; interface, by adding a &quot;category&quot; drop-down menu. Three new classes will be required:</p>
+
+<ul>
+
+<li><p><a href="../../../Lucy/Search/QueryParser.html">QueryParser</a> - Turn a query string into a <a href="../../../Lucy/Search/Query.html">Query</a> object.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/TermQuery.html">TermQuery</a> - Query for a specific term within a specific field.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/ANDQuery.html">ANDQuery</a> - &quot;AND&quot; together multiple Query objects to produce an intersected result set.</p>
+
+</li>
+</ul>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p>Our new &quot;category&quot; field will be a StringType field rather than a FullTextType field, because we will only be looking for exact matches. It needs to be indexed, but since we won&#39;t display its value, it doesn&#39;t need to be stored.</p>
+
+<pre><code>    my $cat_type = Lucy::Plan::StringType-&gt;new( stored =&gt; 0 );
+    $schema-&gt;spec_field( name =&gt; &#39;category&#39;, type =&gt; $cat_type );</code></pre>
+
+<p>There will be three possible values: &quot;article&quot;, &quot;amendment&quot;, and &quot;preamble&quot;, which we&#39;ll hack out of the source file&#39;s name during our <code>parse_file</code> subroutine:</p>
+
+<pre><code>    my $category
+        = $filename =~ /art/      ? &#39;article&#39;
+        : $filename =~ /amend/    ? &#39;amendment&#39;
+        : $filename =~ /preamble/ ? &#39;preamble&#39;
+        :                           die &quot;Can&#39;t derive category for $filename&quot;;
+    return {
+        title    =&gt; $title,
+        content  =&gt; $bodytext,
+        url      =&gt; &quot;/us_constitution/$filename&quot;,
+        category =&gt; $category,
+    };</code></pre>
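+
+<p>The category derivation can be tried on its own before wiring it into <code>parse_file</code>. In this sketch the <code>derive_category</code> wrapper is a hypothetical name, and the filenames <code>art1.txt</code> and <code>preamble.txt</code> are assumed by analogy with <code>amend1.txt</code>.</p>
+
```perl
use strict;
use warnings;

# Same ternary chain as above, wrapped in a sub so it can be run standalone.
sub derive_category {
    my $filename = shift;
    return
          $filename =~ /art/      ? 'article'
        : $filename =~ /amend/    ? 'amendment'
        : $filename =~ /preamble/ ? 'preamble'
        :   die "Can't derive category for $filename";
}

for my $filename (qw( art1.txt amend1.txt preamble.txt )) {
    printf "%-12s => %s\n", $filename, derive_category($filename);
}
```
+
+<p>The patterns are unanchored substring matches and are tested in order, so a filename containing <code>art</code> anywhere is classified as an article before the other two patterns are consulted.</p>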
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>The &quot;category&quot; constraint will be added to our search interface using an HTML &quot;select&quot; element:</p>
+
+<pre><code>    # Build up the HTML &quot;select&quot; object for the &quot;category&quot; field.
+    sub generate_category_select {
+        my $cat = shift;
+        my $select = qq|
+          &lt;select name=&quot;category&quot;&gt;
+            &lt;option value=&quot;&quot;&gt;All Sections&lt;/option&gt;
+            &lt;option value=&quot;article&quot;&gt;Articles&lt;/option&gt;
+            &lt;option value=&quot;amendment&quot;&gt;Amendments&lt;/option&gt;
+          &lt;/select&gt;|;
+        if ($cat) {
+            $select =~ s/&quot;$cat&quot;/&quot;$cat&quot; selected/;
+        }
+        return $select;
+    }</code></pre>
+
+<p>We&#39;ll start off by loading our new modules and extracting our new CGI parameter.</p>
+
+<pre><code>    use Lucy::Search::QueryParser;
+    use Lucy::Search::TermQuery;
+    use Lucy::Search::ANDQuery;
+    
+    ... 
+    
+    my $category = decode( &quot;UTF-8&quot;, $cgi-&gt;param(&#39;category&#39;) || &#39;&#39; );</code></pre>
+
+<p>QueryParser&#39;s constructor requires a &quot;schema&quot; argument. We can get that from our IndexSearcher:</p>
+
+<pre><code>    # Create an IndexSearcher and a QueryParser.
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( 
+        index =&gt; $path_to_index, 
+    );
+    my $qparser  = Lucy::Search::QueryParser-&gt;new( 
+        schema =&gt; $searcher-&gt;get_schema,
+    );</code></pre>
+
+<p>Previously, we have been handing raw query strings to IndexSearcher. Behind the scenes, IndexSearcher has been using a QueryParser to turn those query strings into Query objects. Now, we will bring QueryParser into the foreground and parse the strings explicitly.</p>
+
+<pre><code>    my $query = $qparser-&gt;parse($q);</code></pre>
+
+<p>If the user has specified a category, we&#39;ll use an ANDQuery to join our parsed query together with a TermQuery representing the category.</p>
+
+<pre><code>    if ($category) {
+        my $category_query = Lucy::Search::TermQuery-&gt;new(
+            field =&gt; &#39;category&#39;, 
+            term  =&gt; $category,
+        );
+        $query = Lucy::Search::ANDQuery-&gt;new(
+            children =&gt; [ $query, $category_query ]
+        );
+    }</code></pre>
+
+<p>Now when we execute the query...</p>
+
+<pre><code>    # Execute the Query and get a Hits object.
+    my $hits = $searcher-&gt;hits(
+        query      =&gt; $query,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );</code></pre>
+
+<p>... we&#39;ll get a result set which is the intersection of the parsed query and the category query.</p>
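+
+<p>Conceptually, the intersection behaves like the plain-Perl sketch below. The document ids and the <code>intersect</code> helper are made up for illustration, and the sketch says nothing about how Lucy matches or scores documents internally.</p>
+
```perl
use strict;
use warnings;

# Hypothetical doc ids matched by each child query.
my @parsed_query_hits   = ( 2, 5, 7, 11 );
my @category_query_hits = ( 5, 6, 7, 12 );

# Keep only the ids that appear in both lists.
sub intersect {
    my ( $left, $right ) = @_;
    my %in_right = map { $_ => 1 } @$right;
    return grep { $in_right{$_} } @$left;
}

my @both = intersect( \@parsed_query_hits, \@category_query_hits );
print "@both\n";
```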
+
+<h1 id="Congratulations-">Congratulations!</h1>
+
+<p>You&#39;ve made it to the end of the tutorial.</p>
+
+<h1 id="SEE-ALSO">SEE ALSO</h1>
+
+<p>For additional thematic documentation, see the Apache Lucy <a href="../../../Lucy/Docs/Cookbook.html">Cookbook</a>.</p>
+
+<p>ANDQuery has a companion class, <a href="../../../Lucy/Search/ORQuery.html">ORQuery</a>, and a close relative, <a href="../../../Lucy/Search/RequiredOptionalQuery.html">RequiredOptionalQuery</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Simple.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Simple.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Docs/Tutorial/Simple.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,291 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Simple - Bare-bones search app.</p>
+
+<h2 id="Setup">Setup</h2>
+
+<p>Copy the text presentation of the US Constitution from the <code>sample</code> directory of the Apache Lucy distribution to the base level of your web server&#39;s <code>htdocs</code> directory.</p>
+
+<pre><code>    $ cp -R sample/us_constitution /usr/local/apache2/htdocs/</code></pre>
+
+<h2 id="Indexing:-indexer.pl">Indexing: indexer.pl</h2>
+
+<p>Our first task will be to create an application called <code>indexer.pl</code> which builds a searchable &quot;inverted index&quot; from a collection of documents.</p>
+
+<p>After we specify some configuration variables and load all necessary modules...</p>
+
+<pre><code>    #!/usr/local/bin/perl
+    use strict;
+    use warnings;
+    
+    # (Change configuration variables as needed.)
+    my $path_to_index = &#39;/path/to/index&#39;;
+    my $uscon_source  = &#39;/usr/local/apache2/htdocs/us_constitution&#39;;
+
+    use Lucy::Simple;
+    use File::Spec::Functions qw( catfile );</code></pre>
+
+<p>... we&#39;ll start by creating a Lucy::Simple object, telling it where we&#39;d like the index to be located and the language of the source material.</p>
+
+<pre><code>    my $lucy = Lucy::Simple-&gt;new(
+        path     =&gt; $path_to_index,
+        language =&gt; &#39;en&#39;,
+    );</code></pre>
+
+<p>Next, we&#39;ll add a subroutine which parses our sample documents.</p>
+
+<pre><code>    # Parse a file from our US Constitution collection and return a hashref with
+    # the fields title, content, and url.
+    sub parse_file {
+        my $filename = shift;
+        my $filepath = catfile( $uscon_source, $filename );
+        open( my $fh, &#39;&lt;&#39;, $filepath ) or die &quot;Can&#39;t open &#39;$filepath&#39;: $!&quot;;
+        my $text = do { local $/; &lt;$fh&gt; };    # slurp file content
+        $text =~ /\A(.+?)^\s+(.*)/ms
+            or die &quot;Can&#39;t extract title/bodytext from &#39;$filepath&#39;&quot;;
+        my $title    = $1;
+        my $bodytext = $2;
+        return {
+            title    =&gt; $title,
+            content  =&gt; $bodytext,
+            url      =&gt; &quot;/us_constitution/$filename&quot;,
+        };
+    }</code></pre>
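+
+<p>The title/body split relies on a single regex, which is easier to follow on a small input. The sample text below is an assumption about the file layout (a title line, an empty line, then the body), not a copy of a real sample file.</p>
+
```perl
use strict;
use warnings;

# Hypothetical file content: a title line, an empty line, then the body.
my $text = "Article I\n\nSection 1.\nAll legislative Powers herein granted\n";

# Same pattern as parse_file: lazily capture up to the first line that
# begins with whitespace (an empty line counts), skip that whitespace,
# then capture everything that follows.
$text =~ /\A(.+?)^\s+(.*)/ms
    or die "Can't extract title/bodytext";
my ( $title, $bodytext ) = ( $1, $2 );

chomp $title;    # $1 still carries the newline that ends the title line
print "title: $title\n";
print "body:  $bodytext";
```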
+
+<p>Add some elementary directory reading code...</p>
+
+<pre><code>    # Collect names of source files.
+    opendir( my $dh, $uscon_source )
+        or die &quot;Couldn&#39;t opendir &#39;$uscon_source&#39;: $!&quot;;
+    my @filenames = grep { $_ =~ /\.txt\z/ } readdir $dh;</code></pre>
+
+<p>... and now we&#39;re ready for the meat of indexer.pl -- which occupies exactly one line of code.</p>
+
+<pre><code>    foreach my $filename (@filenames) {
+        my $doc = parse_file($filename);
+        $lucy-&gt;add_doc($doc);  # ta-da!
+    }</code></pre>
+
+<h2 id="Search:-search.cgi">Search: search.cgi</h2>
+
+<p>As with our indexing app, the bulk of the code in our search script won&#39;t be Lucy-specific.</p>
+
+<p>The beginning is dedicated to CGI processing and configuration.</p>
+
+<pre><code>    #!/usr/local/bin/perl -T
+    use strict;
+    use warnings;
+    
+    # (Change configuration variables as needed.)
+    my $path_to_index = &#39;/path/to/index&#39;;
+
+    use CGI;
+    use List::Util qw( max min );
+    use POSIX qw( ceil );
+    use Encode qw( decode );
+    use Lucy::Simple;
+    
+    my $cgi       = CGI-&gt;new;
+    my $q         = decode( &quot;UTF-8&quot;, $cgi-&gt;param(&#39;q&#39;) || &#39;&#39; );
+    my $offset    = decode( &quot;UTF-8&quot;, $cgi-&gt;param(&#39;offset&#39;) || 0 );
+    my $page_size = 10;</code></pre>
+
+<p>Once that&#39;s out of the way, we create our Lucy::Simple object and feed it a query string.</p>
+
+<pre><code>    my $lucy = Lucy::Simple-&gt;new(
+        path     =&gt; $path_to_index,
+        language =&gt; &#39;en&#39;,
+    );
+    my $hit_count = $lucy-&gt;search(
+        query      =&gt; $q,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );</code></pre>
+
+<p>The value returned by search() is the total number of documents in the collection which matched the query. We&#39;ll show this hit count to the user, and also use it in conjunction with the parameters <code>offset</code> and <code>num_wanted</code> to break up results into &quot;pages&quot; of manageable size.</p>
+
+<p>Calling search() on our Simple object turns it into an iterator. Invoking next() now returns hits one at a time as <a href="../../../Lucy/Document/HitDoc.html">Lucy::Document::HitDoc</a> objects, starting with the most relevant.</p>
+
+<pre><code>    # Create result list.
+    my $report = &#39;&#39;;
+    while ( my $hit = $lucy-&gt;next ) {
+        my $score = sprintf( &quot;%0.3f&quot;, $hit-&gt;get_score );
+        $report .= qq|
+            &lt;p&gt;
+              &lt;a href=&quot;$hit-&gt;{url}&quot;&gt;&lt;strong&gt;$hit-&gt;{title}&lt;/strong&gt;&lt;/a&gt;
+              &lt;em&gt;$score&lt;/em&gt;
+              &lt;br&gt;
+              &lt;span class=&quot;excerptURL&quot;&gt;$hit-&gt;{url}&lt;/span&gt;
+            &lt;/p&gt;
+            |;
+    }</code></pre>
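+
+<p>The way the hit count combines with <code>offset</code> and <code>num_wanted</code> to produce pages can be checked in isolation. This sketch mirrors the arithmetic used later in the script; the <code>page_window</code> helper and the numbers fed to it are assumptions for illustration.</p>
+
```perl
use strict;
use warnings;
use List::Util qw( min );
use POSIX qw( ceil );

# Work out which results and which page a given offset corresponds to.
sub page_window {
    my ( $offset, $page_size, $total_hits ) = @_;
    my $last_result  = min( $offset + $page_size, $total_hits );
    my $first_result = min( $offset + 1, $last_result );
    return {
        first_result => $first_result,
        last_result  => $last_result,
        current_page => int( $first_result / $page_size ) + 1,
        last_page    => ceil( $total_hits / $page_size ),
    };
}

my $window = page_window( 20, 10, 47 );
print "Results $window->{first_result}-$window->{last_result}, ",
    "page $window->{current_page} of $window->{last_page}\n";
```
+
+<p>With an offset of 20, a page size of 10, and 47 total hits, this reports results 21-30 on page 3 of 5.</p>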
+
+<p>The rest of the script is just text wrangling.</p>
+
+<pre><code>    #---------------------------------------------------------------#
+    # No tutorial material below this point - just html generation. #
+    #---------------------------------------------------------------#
+    
+    # Generate paging links and hit count, print and exit.
+    my $paging_links = generate_paging_info( $q, $hit_count );
+    blast_out_content( $q, $report, $paging_links );
+    
+    # Create html fragment with links for paging through results n-at-a-time.
+    sub generate_paging_info {
+        my ( $query_string, $total_hits ) = @_;
+        my $escaped_q = CGI::escapeHTML($query_string);
+        my $paging_info;
+        if ( !length $query_string ) {
+            # No query?  No display.
+            $paging_info = &#39;&#39;;
+        }
+        elsif ( $total_hits == 0 ) {
+            # Alert the user that their search failed.
+            $paging_info
+                = qq|&lt;p&gt;No matches for &lt;strong&gt;$escaped_q&lt;/strong&gt;&lt;/p&gt;|;
+        }
+        else {
+            # Calculate the nums for the first and last hit to display.
+            my $last_result = min( ( $offset + $page_size ), $total_hits );
+            my $first_result = min( ( $offset + 1 ), $last_result );
+
+            # Display the result nums, start paging info.
+            $paging_info = qq|
+                &lt;p&gt;
+                    Results &lt;strong&gt;$first_result-$last_result&lt;/strong&gt; 
+                    of &lt;strong&gt;$total_hits&lt;/strong&gt; 
+                    for &lt;strong&gt;$escaped_q&lt;/strong&gt;.
+                &lt;/p&gt;
+                &lt;p&gt;
+                    Results Page:
+                |;
+
+            # Calculate first and last hits pages to display / link to.
+            my $current_page = int( $first_result / $page_size ) + 1;
+            my $last_page    = ceil( $total_hits / $page_size );
+            my $first_page   = max( 1, ( $current_page - 9 ) );
+            $last_page = min( $last_page, ( $current_page + 10 ) );
+
+            # Create a url for use in paging links.
+            my $href = $cgi-&gt;url( -relative =&gt; 1 );
+            $href .= &quot;?q=&quot; . CGI::escape($query_string);
+            $href .= &quot;;offset=&quot; . CGI::escape($offset);
+
+            # Generate the &quot;Prev&quot; link.
+            if ( $current_page &gt; 1 ) {
+                my $new_offset = ( $current_page - 2 ) * $page_size;
+                $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                $paging_info .= qq|&lt;a href=&quot;$href&quot;&gt;&amp;lt;= Prev&lt;/a&gt;\n|;
+            }
+
+            # Generate paging links.
+            for my $page_num ( $first_page .. $last_page ) {
+                if ( $page_num == $current_page ) {
+                    $paging_info .= qq|$page_num \n|;
+                }
+                else {
+                    my $new_offset = ( $page_num - 1 ) * $page_size;
+                    $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                    $paging_info .= qq|&lt;a href=&quot;$href&quot;&gt;$page_num&lt;/a&gt;\n|;
+                }
+            }
+
+            # Generate the &quot;Next&quot; link.
+            if ( $current_page != $last_page ) {
+                my $new_offset = $current_page * $page_size;
+                $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                $paging_info .= qq|&lt;a href=&quot;$href&quot;&gt;Next =&amp;gt;&lt;/a&gt;\n|;
+            }
+
+            # Close tag.
+            $paging_info .= &quot;&lt;/p&gt;\n&quot;;
+        }
+
+        return $paging_info;
+    }
+
+    # Print content to output.
+    sub blast_out_content {
+        my ( $query_string, $hit_list, $paging_info ) = @_;
+        my $escaped_q = CGI::escapeHTML($query_string);
+        binmode( STDOUT, &quot;:encoding(UTF-8)&quot; );
+        print qq|Content-type: text/html; charset=UTF-8\n\n|;
+        print qq|
+    &lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD HTML 4.01 Transitional//EN&quot;
+        &quot;http://www.w3.org/TR/html4/loose.dtd&quot;&gt;
+    &lt;html&gt;
+    &lt;head&gt;
+      &lt;meta http-equiv=&quot;Content-type&quot; 
+        content=&quot;text/html;charset=UTF-8&quot;&gt;
+      &lt;link rel=&quot;stylesheet&quot; type=&quot;text/css&quot; 
+        href=&quot;/us_constitution/uscon.css&quot;&gt;
+      &lt;title&gt;Lucy: $escaped_q&lt;/title&gt;
+    &lt;/head&gt;
+    
+    &lt;body&gt;
+    
+      &lt;div id=&quot;navigation&quot;&gt;
+        &lt;form id=&quot;usconSearch&quot; action=&quot;&quot;&gt;
+          &lt;strong&gt;
+            Search the 
+            &lt;a href=&quot;/us_constitution/index.html&quot;&gt;US Constitution&lt;/a&gt;:
+          &lt;/strong&gt;
+          &lt;input type=&quot;text&quot; name=&quot;q&quot; id=&quot;q&quot; value=&quot;$escaped_q&quot;&gt;
+          &lt;input type=&quot;submit&quot; value=&quot;=&amp;gt;&quot;&gt;
+        &lt;/form&gt;
+      &lt;/div&gt;&lt;!--navigation--&gt;
+    
+      &lt;div id=&quot;bodytext&quot;&gt;
+    
+      $hit_list
+    
+      $paging_info
+    
+        &lt;p style=&quot;font-size: smaller; color: #666&quot;&gt;
+          &lt;em&gt;
+            Powered by &lt;a href=&quot;http://incubator.apache.org/lucy/&quot;
+            &gt;Apache Lucy&lt;small&gt;&lt;sup&gt;TM&lt;/sup&gt;&lt;/small&gt;&lt;/a&gt;
+          &lt;/em&gt;
+        &lt;/p&gt;
+      &lt;/div&gt;&lt;!--bodytext--&gt;
+    
+    &lt;/body&gt;
+    
+    &lt;/html&gt;
+    |;
+    }</code></pre>
+
+<h2 id="OK...-now-what-">OK... now what?</h2>
+
+<p>Lucy::Simple is perfectly adequate for some tasks, but it&#39;s not very flexible. Many people find that it doesn&#39;t do at least one or two things they can&#39;t live without.</p>
+
+<p>In our next tutorial chapter, <a href="../../../Lucy/Docs/Tutorial/BeyondSimple.html">BeyondSimple</a>, we&#39;ll rewrite our indexing and search scripts using the classes that Lucy::Simple hides from view, opening up the possibilities for expansion; then, we&#39;ll spend the rest of the tutorial chapters exploring these possibilities.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/Doc.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/Doc.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/Doc.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,68 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Document::Doc - A document.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $doc = Lucy::Document::Doc-&gt;new(
+        fields =&gt; { foo =&gt; &#39;foo foo&#39;, bar =&gt; &#39;bar bar&#39; },
+    );
+    $indexer-&gt;add_doc($doc);</code></pre>
+
+<p>Doc objects allow access to field values via hashref overloading:</p>
+
+<pre><code>    $doc-&gt;{foo} = &#39;new value for field &quot;foo&quot;&#39;;
+    print &quot;foo: $doc-&gt;{foo}\n&quot;;</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>A Doc object is akin to a row in a database, in that it is made up of one or more fields, each of which has a value.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $doc = Lucy::Document::Doc-&gt;new(
+        fields =&gt; { foo =&gt; &#39;foo foo&#39;, bar =&gt; &#39;bar bar&#39; },
+    );</code></pre>
+
+<ul>
+
+<li><p><b>fields</b> - Field-value pairs.</p>
+
+</li>
+<li><p><b>doc_id</b> - Internal Lucy document id. Default of 0 (an invalid doc id).</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="set_doc_id-doc_id-">set_doc_id(doc_id)</h2>
+
+<p>Set internal Lucy document id.</p>
+
+<h2 id="get_doc_id-">get_doc_id()</h2>
+
+<p>Retrieve internal Lucy document id.</p>
+
+<h2 id="get_fields-">get_fields()</h2>
+
+<p>Return the Doc&#39;s backing fields hash.</p>
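+
+<p>A minimal sketch of inspecting a Doc through its backing hash, assuming <code>get_fields()</code> returns a hash reference as the description above suggests. The field names are the illustrative ones from the SYNOPSIS:</p>
+
+<pre><code>    use Lucy;
+
+    my $doc = Lucy::Document::Doc-&gt;new(
+        fields =&gt; { foo =&gt; &#39;foo foo&#39;, bar =&gt; &#39;bar bar&#39; },
+    );
+
+    # Walk the backing hash and print each field with its value.
+    my $fields = $doc-&gt;get_fields;
+    for my $name ( sort keys %$fields ) {
+        print &quot;$name: $fields-&gt;{$name}\n&quot;;
+    }</code></pre>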
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Document::Doc isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/HitDoc.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/HitDoc.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Document/HitDoc.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,42 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Document::HitDoc - A document read from an index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    while ( my $hit_doc = $hits-&gt;next ) {
+        print &quot;$hit_doc-&gt;{title}\n&quot;;
+        print $hit_doc-&gt;get_score . &quot;\n&quot;;
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>HitDoc is the search-time relative of the index-time class Doc; it is augmented by a numeric score attribute that Doc doesn&#39;t have.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="set_score-score-">set_score(score)</h2>
+
+<p>Set score attribute.</p>
+
+<h2 id="get_score-">get_score()</h2>
+
+<p>Get score attribute.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Document::HitDoc isa <a href="../../Lucy/Document/Doc.html">Lucy::Document::Doc</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Highlight/Highlighter.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Highlight/Highlighter.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Highlight/Highlighter.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,114 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Highlight::Highlighter - Create and highlight excerpts.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher =&gt; $searcher,
+        query    =&gt; $query,
+        field    =&gt; &#39;body&#39;
+    );
+    my $hits = $searcher-&gt;hits( query =&gt; $query );
+    while ( my $hit = $hits-&gt;next ) {
+        my $excerpt = $highlighter-&gt;create_excerpt($hit);
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>The Highlighter can be used to select relevant snippets from a document, and to surround search terms with highlighting tags. It handles both stems and phrases correctly and efficiently, using special-purpose data generated at index-time.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher       =&gt; $searcher,    # required
+        query          =&gt; $query,       # required
+        field          =&gt; &#39;content&#39;,    # required
+        excerpt_length =&gt; 150,          # default: 200
+    );</code></pre>
+
+<ul>
+
+<li><p><b>searcher</b> - An object which inherits from <a href="../../Lucy/Search/Searcher.html">Searcher</a>, such as an <a href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a>.</p>
+
+</li>
+<li><p><b>query</b> - Query object or a query string.</p>
+
+</li>
+<li><p><b>field</b> - The name of the field from which to draw the excerpt. The field must be marked as <code>highlightable</code> (see <a href="../../Lucy/Plan/FieldType.html">FieldType</a>).</p>
+
+</li>
+<li><p><b>excerpt_length</b> - Maximum length of the excerpt, in characters.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="create_excerpt-hit_doc-">create_excerpt(hit_doc)</h2>
+
+<p>Take a HitDoc object and return a highlighted excerpt as a string if the HitDoc has a value for the specified <code>field</code>.</p>
+
+<h2 id="highlight-text-">highlight(text)</h2>
+
+<p>Highlight a small section of text. By default, prepends pre-tag and appends post-tag. This method is called internally by create_excerpt() when assembling an excerpt.</p>
+
+<h2 id="encode-text-">encode(text)</h2>
+
+<p>Encode text with HTML entities. This method is called internally by create_excerpt() for each text fragment when assembling an excerpt. A subclass can override this if the text should be encoded differently or not at all.</p>
+
+<h2 id="set_pre_tag-pre_tag-">set_pre_tag(pre_tag)</h2>
+
+<p>Setter. The default value is &quot;&lt;strong&gt;&quot;.</p>
+
+<h2 id="get_pre_tag-">get_pre_tag()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="set_post_tag-post_tag-">set_post_tag(post_tag)</h2>
+
+<p>Setter. The default value is &quot;&lt;/strong&gt;&quot;.</p>
+
+<h2 id="get_post_tag-">get_post_tag()</h2>
+
+<p>Accessor.</p>
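+
+<p>As a rough illustration of the tag setters above, the sketch below swaps the default <code>&lt;strong&gt;</code> tags for a custom pair. The index path, the query string, the <code>body</code> field and the <code>hit</code> CSS class are all assumptions made for the example:</p>
+
+<pre><code>    use Lucy;
+
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new(
+        index =&gt; &#39;/path/to/index&#39;,    # assumed location
+    );
+    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher =&gt; $searcher,
+        query    =&gt; &#39;senate&#39;,         # a query string is accepted here
+        field    =&gt; &#39;body&#39;,           # must be a highlightable field
+    );
+
+    # Replace the default &lt;strong&gt;...&lt;/strong&gt; wrapper.
+    $highlighter-&gt;set_pre_tag(&#39;&lt;em class=&quot;hit&quot;&gt;&#39;);
+    $highlighter-&gt;set_post_tag(&#39;&lt;/em&gt;&#39;);
+
+    my $hits = $searcher-&gt;hits( query =&gt; &#39;senate&#39; );
+    while ( my $hit = $hits-&gt;next ) {
+        my $excerpt = $highlighter-&gt;create_excerpt($hit);
+        print &quot;$excerpt\n&quot; if defined $excerpt;
+    }</code></pre>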
+
+<h2 id="get_searcher-">get_searcher()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_query-">get_query()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_compiler-">get_compiler()</h2>
+
+<p>Accessor for the Lucy::Search::Compiler object derived from <code>query</code> and <code>searcher</code>.</p>
+
+<h2 id="get_excerpt_length-">get_excerpt_length()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_field-">get_field()</h2>
+
+<p>Accessor.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Highlight::Highlighter isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/BackgroundMerger.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/BackgroundMerger.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/BackgroundMerger.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,72 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::BackgroundMerger - Consolidate index segments in the background.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $bg_merger = Lucy::Index::BackgroundMerger-&gt;new(
+        index  =&gt; &#39;/path/to/index&#39;,
+    );
+    $bg_merger-&gt;commit;</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Adding documents to an index is usually fast, but every once in a while the index must be compacted and an update takes substantially longer to complete. See <a href="../../Lucy/Docs/Cookbook/FastUpdates.html">Lucy::Docs::Cookbook::FastUpdates</a> for how to use this class to control worst-case index update performance.</p>
+
+<p>As with <a href="../../Lucy/Index/Indexer.html">Indexer</a>, see <a href="../../Lucy/Docs/FileLocking.html">Lucy::Docs::FileLocking</a> if your index is on a shared volume.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $bg_merger = Lucy::Index::BackgroundMerger-&gt;new(
+        index   =&gt; &#39;/path/to/index&#39;,    # required
+        manager =&gt; $manager             # default: created internally
+    );</code></pre>
+
+<p>Open a new BackgroundMerger.</p>
+
+<ul>
+
+<li><p><b>index</b> - Either a string filepath or a Folder.</p>
+
+</li>
+<li><p><b>manager</b> - An IndexManager. If not supplied, an IndexManager with a 10-second write lock timeout will be created.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="commit-">commit()</h2>
+
+<p>Commit any changes made to the index. Until this is called, none of the changes made during an indexing session are permanent.</p>
+
+<p>Calls prepare_commit() implicitly if it has not already been called.</p>
+
+<h2 id="prepare_commit-">prepare_commit()</h2>
+
+<p>Perform the expensive setup for commit() in advance, so that commit() completes quickly.</p>
+
+<p>Towards the end of prepare_commit(), the BackgroundMerger attempts to re-acquire the write lock, which is then held until commit() finishes and releases it.</p>
+
+<h2 id="optimize-">optimize()</h2>
+
+<p>Optimize the index for search-time performance. This may take a while, as it can involve rewriting large amounts of data.</p>
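+
+<p>Putting the three methods together, a background merge might be sketched as follows. The index path is a placeholder, and calling optimize() is optional; it is shown only to illustrate where it would fit:</p>
+
+<pre><code>    use Lucy;
+
+    my $bg_merger = Lucy::Index::BackgroundMerger-&gt;new(
+        index =&gt; &#39;/path/to/index&#39;,    # placeholder path
+    );
+
+    # Optional: request a full consolidation rather than a routine one.
+    $bg_merger-&gt;optimize;
+
+    # Do the expensive work up front. The write lock is re-acquired
+    # towards the end of this call.
+    $bg_merger-&gt;prepare_commit;
+
+    # Finish quickly and release the write lock.
+    $bg_merger-&gt;commit;</code></pre>
+
+<p>Splitting the work this way keeps the time spent between acquiring and releasing the write lock short, which is the point of running the merge in the background.</p>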
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::BackgroundMerger isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,101 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DataReader - Abstract base class for reading index data.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    # Abstract base class.</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DataReader is the companion class to <a href="../../Lucy/Index/DataWriter.html">DataWriter</a>. Every index component will implement one of each.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $reader = MyDataReader-&gt;new(
+        schema   =&gt; $seg_reader-&gt;get_schema,      # default undef
+        folder   =&gt; $seg_reader-&gt;get_folder,      # default undef
+        snapshot =&gt; $seg_reader-&gt;get_snapshot,    # default undef
+        segments =&gt; $seg_reader-&gt;get_segments,    # default undef
+        seg_tick =&gt; $seg_reader-&gt;get_seg_tick,    # default -1
+    );</code></pre>
+
+<ul>
+
+<li><p><b>schema</b> - A Schema.</p>
+
+</li>
+<li><p><b>folder</b> - A Folder.</p>
+
+</li>
+<li><p><b>snapshot</b> - A Snapshot.</p>
+
+</li>
+<li><p><b>segments</b> - An array of Segments.</p>
+
+</li>
+<li><p><b>seg_tick</b> - The array index of the Segment object within the <code>segments</code> array that this particular DataReader is assigned to, if any. A value of -1 indicates that no Segment should be assigned.</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="aggregator-labeled-params-">aggregator( <i>[labeled params]</i> )</h2>
+
+<p>Create a reader which aggregates the output of several lower level readers. Return undef if such a reader is not valid.</p>
+
+<ul>
+
+<li><p><b>readers</b> - An array of DataReaders.</p>
+
+</li>
+<li><p><b>offsets</b> - Doc id start offsets for each reader.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="get_schema-">get_schema()</h2>
+
+<p>Accessor for &quot;schema&quot; member var.</p>
+
+<h2 id="get_folder-">get_folder()</h2>
+
+<p>Accessor for &quot;folder&quot; member var.</p>
+
+<h2 id="get_snapshot-">get_snapshot()</h2>
+
+<p>Accessor for &quot;snapshot&quot; member var.</p>
+
+<h2 id="get_segments-">get_segments()</h2>
+
+<p>Accessor for &quot;segments&quot; member var.</p>
+
+<h2 id="get_segment-">get_segment()</h2>
+
+<p>Accessor for &quot;segment&quot; member var.</p>
+
+<h2 id="get_seg_tick-">get_seg_tick()</h2>
+
+<p>Accessor for &quot;seg_tick&quot; member var.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DataReader isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataWriter.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataWriter.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DataWriter.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,144 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DataWriter - Write data to an index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    # Abstract base class.</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DataWriter is an abstract base class for writing index data, generally in segment-sized chunks. Each component of an index -- e.g. stored fields, lexicon, postings, deletions -- is represented by a DataWriter/<a href="../../Lucy/Index/DataReader.html">DataReader</a> pair.</p>
+
+<p>Components may be specified per index by subclassing <a href="../../Lucy/Plan/Architecture.html">Architecture</a>.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $writer = MyDataWriter-&gt;new(
+        snapshot   =&gt; $snapshot,      # required
+        segment    =&gt; $segment,       # required
+        polyreader =&gt; $polyreader,    # required
+    );</code></pre>
+
+<ul>
+
+<li><p><b>snapshot</b> - The Snapshot that will be committed at the end of the indexing session.</p>
+
+</li>
+<li><p><b>segment</b> - The Segment in progress.</p>
+
+</li>
+<li><p><b>polyreader</b> - A PolyReader representing all existing data in the index. (If the index is brand new, the PolyReader will have no sub-readers).</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="add_inverted_doc-labeled-params-">add_inverted_doc( <i>[labeled params]</i> )</h2>
+
+<p>Process a document, previously inverted by <code>inverter</code>.</p>
+
+<ul>
+
+<li><p><b>inverter</b> - An Inverter wrapping an inverted document.</p>
+
+</li>
+<li><p><b>doc_id</b> - Internal number assigned to this document within the segment.</p>
+
+</li>
+</ul>
+
+<h2 id="add_segment-labeled-params-">add_segment( <i>[labeled params]</i> )</h2>
+
+<p>Add content from an existing segment into the one currently being written.</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader containing content to add.</p>
+
+</li>
+<li><p><b>doc_map</b> - An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.</p>
+
+</li>
+</ul>
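+
+<p>To make the <code>doc_map</code> convention concrete, here is a small sketch in plain Perl. It is not Lucy code: <code>build_doc_map</code> is a hypothetical helper, and a plain Perl array stands in for whatever object Lucy actually passes. It assumes that doc ids start at 1, that slot 0 of the map is unused, and that surviving documents are renumbered consecutively after a starting offset:</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Hypothetical helper, not part of the Lucy API.
+    # $doc_max: highest doc id in the old segment
+    # $deleted: array ref of deleted old doc ids
+    # $offset:  number of docs already in the new segment
+    sub build_doc_map {
+        my ( $doc_max, $deleted, $offset ) = @_;
+        my %is_deleted = map { $_ =&gt; 1 } @$deleted;
+        my @doc_map    = (0);    # slot 0 unused: 0 is not a valid doc id
+        my $next       = $offset;
+        for my $old_id ( 1 .. $doc_max ) {
+            # Deleted docs map to 0; survivors get the next new id.
+            push @doc_map, $is_deleted{$old_id} ? 0 : ++$next;
+        }
+        return \@doc_map;
+    }
+
+    my $doc_map = build_doc_map( 5, [ 2, 4 ], 10 );
+    print &quot;@$doc_map\n&quot;;    # prints &quot;0 11 0 12 0 13&quot;</code></pre>
+
+<p>A writer can then look up <code>$doc_map-&gt;[$old_id]</code> for each document and skip any entry that is 0.</p>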
+
+<h2 id="finish-">finish()</h2>
+
+<p>Complete the segment: close all streams, store metadata, etc.</p>
+
+<h2 id="format-">format()</h2>
+
+<p>Every writer must specify a file format revision number, which should increment each time the format changes. Responsibility for revision checking is left to the companion DataReader.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="delete_segment-reader-">delete_segment(reader)</h2>
+
+<p>Remove a segment&#39;s data. The default implementation is a no-op, as all files within the segment directory will be automatically deleted. Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader containing content to merge, which must represent a segment that is part of the current snapshot.</p>
+
+</li>
+</ul>
+
+<h2 id="merge_segment-labeled-params-">merge_segment( <i>[labeled params]</i> )</h2>
+
+<p>Move content from an existing segment into the one currently being written.</p>
+
+<p>The default implementation calls add_segment() then delete_segment().</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader containing content to merge, which must represent a segment that is part of the current snapshot.</p>
+
+</li>
+<li><p><b>doc_map</b> - An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.</p>
+
+</li>
+</ul>
+
+<h2 id="metadata-">metadata()</h2>
+
+<p>Arbitrary metadata to be serialized and stored by the Segment. The default implementation supplies a Hash with a single key-value pair for &quot;format&quot;.</p>
+
+<h2 id="get_snapshot-">get_snapshot()</h2>
+
+<p>Accessor for &quot;snapshot&quot; member var.</p>
+
+<h2 id="get_segment-">get_segment()</h2>
+
+<p>Accessor for &quot;segment&quot; member var.</p>
+
+<h2 id="get_polyreader-">get_polyreader()</h2>
+
+<p>Accessor for &quot;polyreader&quot; member var.</p>
+
+<h2 id="get_schema-">get_schema()</h2>
+
+<p>Accessor for &quot;schema&quot; member var.</p>
+
+<h2 id="get_folder-">get_folder()</h2>
+
+<p>Accessor for &quot;folder&quot; member var.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DataWriter isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DeletionsWriter.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DeletionsWriter.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DeletionsWriter.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,79 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DeletionsWriter - Abstract base class for marking documents as deleted.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $polyreader  = $del_writer-&gt;get_polyreader;
+    my $seg_readers = $polyreader-&gt;seg_readers;
+    for my $seg_reader (@$seg_readers) {
+        my $count = $del_writer-&gt;seg_del_count( $seg_reader-&gt;get_seg_name );
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Subclasses of DeletionsWriter provide a low-level mechanism for declaring a document deleted from an index.</p>
+
+<p>Because files in an index are never modified, and because it is not practical to delete entire segments, a DeletionsWriter does not actually remove documents from the index. Instead, it records which documents are deleted in a form that its search-time companion, DeletionsReader, can use to create a Matcher iterator.</p>
+
+<p>Documents are truly deleted only when the segments which contain them are merged into new ones.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="delete_by_term-labeled-params-">delete_by_term( <i>[labeled params]</i> )</h2>
+
+<p>Delete all documents in the index that contain the supplied term in <code>field</code>.</p>
+
+<ul>
+
+<li><p><b>field</b> - The name of an indexed field. (If it is not specified as <code>indexed</code>, an error will occur.)</p>
+
+</li>
+<li><p><b>term</b> - The term which identifies docs to be marked as deleted. If <code>field</code> is associated with an Analyzer, <code>term</code> will be processed automatically (so don&#39;t pre-process it yourself).</p>
+
+</li>
+</ul>
+
+<h2 id="delete_by_query-query-">delete_by_query(query)</h2>
+
+<p>Delete all documents in the index that match <code>query</code>.</p>
+
+<ul>
+
+<li><p><b>query</b> - A <a href="../../Lucy/Search/Query.html">Query</a>.</p>
+
+</li>
+</ul>
+
+<h2 id="updated-">updated()</h2>
+
+<p>Returns true if there are updates that need to be written.</p>
+
+<h2 id="seg_del_count-seg_name-">seg_del_count(seg_name)</h2>
+
+<p>Return the number of deletions for a given segment.</p>
+
+<ul>
+
+<li><p><b>seg_name</b> - The name of the segment.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DeletionsWriter isa <a href="../../Lucy/Index/DataWriter.html">Lucy::Index::DataWriter</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DocReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DocReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/DocReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,53 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DocReader - Retrieve stored documents.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $doc_reader = $seg_reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+    my $doc        = $doc_reader-&gt;fetch_doc($doc_id);</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DocReader defines the interface by which documents (with all stored fields) are retrieved from the index. The default implementation returns <a href="../../Lucy/Document/HitDoc.html">HitDoc</a> objects.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="fetch_doc-doc_id-">fetch_doc(doc_id)</h2>
+
+<p>Retrieve the document identified by <code>doc_id</code>.</p>
+
+<p>Returns: a HitDoc.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="aggregator-labeled-params-">aggregator( <i>[labeled params]</i> )</h2>
+
+<p>Returns a DocReader which divvies up requests to its sub-readers according to the offset range.</p>
+
+<ul>
+
+<li><p><b>readers</b> - An array of DocReaders.</p>
+
+</li>
+<li><p><b>offsets</b> - Doc id start offsets for each reader.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DocReader isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexManager.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexManager.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexManager.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,119 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::IndexManager - Policies governing index updating, locking, and file deletion.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    use Sys::Hostname qw( hostname );
+    my $hostname = hostname() or die &quot;Can&#39;t get unique hostname&quot;;
+    my $manager = Lucy::Index::IndexManager-&gt;new( 
+        host =&gt; $hostname,
+    );
+
+    # Index time:
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+
+    # Search time:
+    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index   =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( index =&gt; $reader );</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>IndexManager is an advanced-use class for controlling index locking, updating, merging, and deletion behaviors.</p>
+
+<p>IndexManager and <a href="../../Lucy/Plan/Architecture.html">Architecture</a> are complementary classes: Architecture is used to define traits and behaviors which cannot change for the life of an index; IndexManager is used for defining rules which may change from process to process.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $manager = Lucy::Index::IndexManager-&gt;new(
+        host =&gt; $hostname,    # default: &quot;&quot;
+    );</code></pre>
+
+<ul>
+
+<li><p><b>host</b> - An identifier which should be unique per-machine.</p>
+
+</li>
+<li><p><b>lock_factory</b> - A LockFactory.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="make_write_lock-">make_write_lock()</h2>
+
+<p>Create the Lock which controls access to modifying the logical content of the index.</p>
+
+<h2 id="recycle-labeled-params-">recycle( <i>[labeled params]</i> )</h2>
+
+<p>Return an array of SegReaders representing segments that should be consolidated. Implementations must balance index-time churn against search-time degradation due to segment proliferation. The default implementation prefers small segments or segments with a high proportion of deletions.</p>
+
+<ul>
+
+<li><p><b>reader</b> - A PolyReader.</p>
+
+</li>
+<li><p><b>del_writer</b> - A DeletionsWriter.</p>
+
+</li>
+<li><p><b>cutoff</b> - A segment number which all returned SegReaders must exceed.</p>
+
+</li>
+<li><p><b>optimize</b> - A boolean indicating whether to spend extra time optimizing the index for search-time performance.</p>
+
+</li>
+</ul>
+
+<h2 id="set_folder-folder-">set_folder(folder)</h2>
+
+<p>Setter for <code>folder</code> member. Typical clients (Indexer, IndexReader) will use this method to install their own Folder instance.</p>
+
+<h2 id="get_folder-">get_folder()</h2>
+
+<p>Getter for <code>folder</code> member.</p>
+
+<h2 id="get_host-">get_host()</h2>
+
+<p>Getter for <code>host</code> member.</p>
+
+<h2 id="set_write_lock_timeout-timeout-">set_write_lock_timeout(timeout)</h2>
+
+<p>Setter for write lock timeout. Default: 1000 milliseconds.</p>
+
+<h2 id="get_write_lock_timeout-">get_write_lock_timeout()</h2>
+
+<p>Getter for write lock timeout.</p>
+
+<h2 id="set_write_lock_interval-timeout-">set_write_lock_interval(timeout)</h2>
+
+<p>Setter for write lock retry interval. Default: 100 milliseconds.</p>
+
+<h2 id="get_write_lock_interval-">get_write_lock_interval()</h2>
+
+<p>Getter for write lock retry interval.</p>
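+
+<p>As a rough sketch of how the lock settings above might be tuned, the example below gives an Indexer more patience than the defaults. The 5000 and 250 millisecond values and the index path are arbitrary choices for illustration, not recommendations:</p>
+
+<pre><code>    use Lucy;
+    use Sys::Hostname qw( hostname );
+
+    my $hostname = hostname() or die &quot;Can&#39;t get unique hostname&quot;;
+    my $manager = Lucy::Index::IndexManager-&gt;new( host =&gt; $hostname );
+
+    # Wait up to 5 seconds for the write lock, retrying every 250 ms.
+    $manager-&gt;set_write_lock_timeout(5000);
+    $manager-&gt;set_write_lock_interval(250);
+
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index   =&gt; &#39;/path/to/index&#39;,    # placeholder path
+        manager =&gt; $manager,
+    );</code></pre>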
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::IndexManager isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/IndexReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,116 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::IndexReader - Read from an inverted index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index =&gt; &#39;/path/to/index&#39;,
+    );
+    my $seg_readers = $reader-&gt;seg_readers;
+    for my $seg_reader (@$seg_readers) {
+        my $seg_name = $seg_reader-&gt;get_segment-&gt;get_name;
+        my $num_docs = $seg_reader-&gt;doc_max;
+        print &quot;Segment $seg_name ($num_docs documents):\n&quot;;
+        my $doc_reader = $seg_reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+        for my $doc_id ( 1 .. $num_docs ) {
+            my $doc = $doc_reader-&gt;fetch_doc($doc_id);
+            print &quot;  $doc_id: $doc-&gt;{title}\n&quot;;
+        }
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>IndexReader is the interface through which <a href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a> objects access the content of an index.</p>
+
+<p>IndexReader objects always represent a point-in-time view of an index as it existed at the moment the reader was created. If you want search results to reflect modifications to an index, you must create a new IndexReader after the update process completes.</p>
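+
+<p>A minimal sketch of this point-in-time behavior, using only calls documented on this page and in Lucy::Index::Indexer. The path <code>/path/to/index</code> and the <code>title</code> field are placeholders; the sketch assumes that an index with such a field already exists.</p>
+
+<pre><code>    use Lucy::Index::IndexReader;
+    use Lucy::Index::Indexer;
+
+    my $path = &#39;/path/to/index&#39;;    # placeholder
+
+    # This reader sees the index as it exists right now.
+    my $old_reader = Lucy::Index::IndexReader-&gt;open( index =&gt; $path );
+    my $before     = $old_reader-&gt;doc_max;
+
+    # Modify the index in a separate indexing session.
+    my $indexer = Lucy::Index::Indexer-&gt;new( index =&gt; $path );
+    $indexer-&gt;add_doc( { title =&gt; &#39;A new document&#39; } );
+    $indexer-&gt;commit;
+
+    # The old reader is unaffected by the commit ...
+    print &quot;old reader: &quot;, $old_reader-&gt;doc_max, &quot; (was $before)\n&quot;;
+
+    # ... so open a fresh reader to see the added document.
+    my $new_reader = Lucy::Index::IndexReader-&gt;open( index =&gt; $path );
+    print &quot;new reader: &quot;, $new_reader-&gt;doc_max, &quot;\n&quot;;</code></pre>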
+
+<p>IndexReaders are composites; most of the work is done by individual <a href="../../Lucy/Index/DataReader.html">DataReader</a> sub-components, which may be accessed via fetch() and obtain(). The most efficient and powerful access to index data happens at the segment level via <a href="../../Lucy/Index/SegReader.html">SegReader</a>&#39;s sub-components.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="open-labeled-params-">open( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index    =&gt; &#39;/path/to/index&#39;, # required
+        snapshot =&gt; $snapshot,
+        manager  =&gt; $index_manager,
+    );</code></pre>
+
+<p>IndexReader is an abstract base class; open() returns the IndexReader subclass PolyReader, which channels the output of zero or more SegReaders.</p>
+
+<ul>
+
+<li><p><b>index</b> - Either a string filepath or a Folder.</p>
+
+</li>
+<li><p><b>snapshot</b> - A Snapshot. If not supplied, the most recent snapshot file will be used.</p>
+
+</li>
+<li><p><b>manager</b> - An <a href="../../Lucy/Index/IndexManager.html">IndexManager</a>. Read-locking is off by default; supplying this argument turns it on.</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="doc_max-">doc_max()</h2>
+
+<p>Return the maximum number of documents available to the reader, which is also the highest possible internal document id. Documents which have been marked as deleted but not yet purged from the index are included in this count.</p>
+
+<h2 id="doc_count-">doc_count()</h2>
+
+<p>Return the number of documents available to the reader, subtracting any that are marked as deleted.</p>
+
+<h2 id="del_count-">del_count()</h2>
+
+<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p>
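+
+<p>Going by the three descriptions above, doc_count() should equal doc_max() minus del_count(). The following sketch checks that relationship; the index path is a placeholder, and the relationship is inferred from the descriptions rather than stated as a guarantee.</p>
+
+<pre><code>    use Lucy::Index::IndexReader;
+
+    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index =&gt; &#39;/path/to/index&#39;,    # placeholder
+    );
+    my $max     = $reader-&gt;doc_max;      # includes docs marked as deleted
+    my $live    = $reader-&gt;doc_count;    # excludes docs marked as deleted
+    my $deleted = $reader-&gt;del_count;    # marked as deleted, not yet purged
+    print &quot;doc_max=$max doc_count=$live del_count=$deleted\n&quot;;
+    warn &quot;counts do not add up\n&quot; unless $live == $max - $deleted;</code></pre>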
+
+<h2 id="seg_readers-">seg_readers()</h2>
+
+<p>Return an array of all the SegReaders represented within the IndexReader.</p>
+
+<h2 id="offsets-">offsets()</h2>
+
+<p>Return an array with one entry per segment, each entry being the doc_id start offset of that segment.</p>
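+
+<p>As an illustration of how start offsets can be used, the sketch below translates a segment-local doc id into an index-wide doc id. It is plain Perl with hard-coded, hypothetical offsets for three segments of 10 documents each, and it assumes that an index-wide doc id is the start offset of the segment plus the segment-local doc id. The helper <code>to_index_doc_id</code> is not part of Lucy.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Hypothetical start offsets, one entry per segment.
+    my @offsets = ( 0, 10, 20 );
+
+    # Hypothetical helper: segment position plus local doc id to index-wide id.
+    sub to_index_doc_id {
+        my ( $seg_tick, $seg_doc_id ) = @_;
+        return $offsets[$seg_tick] + $seg_doc_id;
+    }
+
+    print to_index_doc_id( 0, 1 ),  &quot;\n&quot;;    # first doc of the first segment
+    print to_index_doc_id( 2, 10 ), &quot;\n&quot;;    # last doc of the third segment</code></pre>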
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="fetch-api-">fetch(api)</h2>
+
+<p>Fetch a component, or return undef if the component can&#39;t be found.</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataReader subclass that the desired component must implement.</p>
+
+</li>
+</ul>
+
+<h2 id="obtain-api-">obtain(api)</h2>
+
+<p>Fetch a component, or throw an error if the component can&#39;t be found.</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataReader subclass that the desired component must implement.</p>
+
+</li>
+</ul>
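+
+<p>The difference between fetch() and obtain() shows up only when a component is missing. A short sketch, assuming <code>$reader</code> was returned by open() as in the SYNOPSIS:</p>
+
+<pre><code>    # fetch() returns undef for a missing component, so test the result.
+    my $lex_reader = $reader-&gt;fetch(&quot;Lucy::Index::LexiconReader&quot;);
+    if ( !defined $lex_reader ) {
+        print &quot;No LexiconReader available\n&quot;;
+    }
+
+    # obtain() throws instead, so trap the exception if failure is expected.
+    my $doc_reader = eval { $reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;) };
+    if ( !defined $doc_reader ) {
+        warn &quot;Could not obtain DocReader: $@&quot;;
+    }</code></pre>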
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::IndexReader isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Indexer.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Indexer.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Indexer.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,149 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::Indexer - Build inverted indexes.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $indexer = Lucy::Index::Indexer-&gt;new(
+        schema =&gt; $schema,
+        index  =&gt; &#39;/path/to/index&#39;,
+        create =&gt; 1,
+    );
+    while ( my ( $title, $content ) = each %source_docs ) {
+        $indexer-&gt;add_doc({
+            title   =&gt; $title,
+            content =&gt; $content,
+        });
+    }
+    $indexer-&gt;commit;</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>The Indexer class is Apache Lucy&#39;s primary tool for managing the content of inverted indexes, which may later be searched using <a href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a>.</p>
+
+<p>In general, only one Indexer at a time may write to an index safely. If a write lock cannot be secured, new() will throw an exception.</p>
+
+<p>If an index is located on a shared volume, each writer application must identify itself by supplying an <a href="../../Lucy/Index/IndexManager.html">IndexManager</a> with a unique <code>host</code> id to Indexer&#39;s constructor, or index corruption will occur. See <a href="../../Lucy/Docs/FileLocking.html">Lucy::Docs::FileLocking</a> for a detailed discussion.</p>
+
+<p>Note: at present, delete_by_term() and delete_by_query() only affect documents which have previously been committed to the index -- and not documents added during the current indexing session but not yet committed. This may change in a future update.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params-">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $indexer = Lucy::Index::Indexer-&gt;new(
+        schema   =&gt; $schema,             # required at index creation
+        index    =&gt; &#39;/path/to/index&#39;,    # required
+        create   =&gt; 1,                   # default: 0
+        truncate =&gt; 1,                   # default: 0
+        manager  =&gt; $manager             # default: created internally
+    );</code></pre>
+
+<ul>
+
+<li><p><b>schema</b> - A Schema. Required when index is being created; if not supplied, will be extracted from the index folder.</p>
+
+</li>
+<li><p><b>index</b> - Either a filepath to an index or a Folder.</p>
+
+</li>
+<li><p><b>create</b> - If true and the index directory does not exist, attempt to create it.</p>
+
+</li>
+<li><p><b>truncate</b> - If true, proceed with the intention of discarding all previous indexing data. The old data will remain intact and visible until commit() succeeds.</p>
+
+</li>
+<li><p><b>manager</b> - An IndexManager.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="add_doc-...-">add_doc(...)</h2>
+
+<pre><code>    $indexer-&gt;add_doc($doc);
+    $indexer-&gt;add_doc( { field_name =&gt; $field_value } );
+    $indexer-&gt;add_doc(
+        doc   =&gt; { field_name =&gt; $field_value },
+        boost =&gt; 2.5,         # default: 1.0
+    );</code></pre>
+
+<p>Add a document to the index. Accepts either a single argument or labeled params.</p>
+
+<ul>
+
+<li><p><b>doc</b> - Either a Lucy::Document::Doc object, or a hashref (which will be attached to a Lucy::Document::Doc object internally).</p>
+
+</li>
+<li><p><b>boost</b> - A floating point weight which affects how this document scores.</p>
+
+</li>
+</ul>
+
+<h2 id="add_index-index-">add_index(index)</h2>
+
+<p>Absorb an existing index into this one. The two indexes must have matching Schemas.</p>
+
+<ul>
+
+<li><p><b>index</b> - Either an index path name or a Folder.</p>
+
+</li>
+</ul>
+
+<h2 id="optimize-">optimize()</h2>
+
+<p>Optimize the index for search-time performance. This may take a while, as it can involve rewriting large amounts of data.</p>
+
+<h2 id="commit-">commit()</h2>
+
+<p>Commit any changes made to the index. Until this is called, none of the changes made during an indexing session are permanent.</p>
+
+<p>Calling commit() invalidates the Indexer, so if you want to make more changes you&#39;ll need a new one.</p>
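+
+<p>In practice this means one Indexer per indexing session. A minimal sketch, assuming an existing index at a placeholder path with a <code>title</code> field:</p>
+
+<pre><code>    use Lucy::Index::Indexer;
+
+    my $path = &#39;/path/to/index&#39;;    # placeholder
+
+    # First session: add a document and commit.
+    my $indexer = Lucy::Index::Indexer-&gt;new( index =&gt; $path );
+    $indexer-&gt;add_doc( { title =&gt; &#39;First batch&#39; } );
+    $indexer-&gt;commit;
+
+    # The committed Indexer is spent; create a new one for further changes.
+    $indexer = Lucy::Index::Indexer-&gt;new( index =&gt; $path );
+    $indexer-&gt;add_doc( { title =&gt; &#39;Second batch&#39; } );
+    $indexer-&gt;commit;</code></pre>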
+
+<h2 id="prepare_commit-">prepare_commit()</h2>
+
+<p>Perform the expensive setup for commit() in advance, so that commit() completes quickly. (If prepare_commit() is not called explicitly by the user, commit() will call it internally.)</p>
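+
+<p>A sketch of splitting a commit into its slow and fast phases. The array <code>@docs</code> is a hypothetical list of hashrefs, and <code>$indexer</code> is assumed to be a freshly constructed Indexer:</p>
+
+<pre><code>    $indexer-&gt;add_doc($_) for @docs;    # @docs is hypothetical
+
+    # Do the expensive work up front ...
+    $indexer-&gt;prepare_commit;
+
+    # ... (other work can be coordinated here) ...
+
+    # ... so that the final step completes quickly.
+    $indexer-&gt;commit;</code></pre>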
+
+<h2 id="delete_by_term-labeled-params-">delete_by_term( <i>[labeled params]</i> )</h2>
+
+<p>Mark documents which contain the supplied term as deleted, so that they will be excluded from search results and eventually removed altogether. The change is not apparent to search apps until after commit() succeeds.</p>
+
+<ul>
+
+<li><p><b>field</b> - The name of an indexed field. (If it is not specified as <code>indexed</code>, an error will occur.)</p>
+
+</li>
+<li><p><b>term</b> - The term which identifies docs to be marked as deleted. If <code>field</code> is associated with an Analyzer, <code>term</code> will be processed automatically (so don&#39;t pre-process it yourself).</p>
+
+</li>
+</ul>
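+
+<p>One common use is replacing a document identified by a unique key. The sketch below assumes a hypothetical <code>url</code> field that is indexed and has no Analyzer, so that the term matches the stored value exactly; the field names and values are illustrative only. Because deletions currently affect only previously committed documents, the replacement added in the same session is not deleted.</p>
+
+<pre><code>    use Lucy::Index::Indexer;
+
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index =&gt; &#39;/path/to/index&#39;,    # placeholder
+    );
+
+    # Mark the previously committed version as deleted.
+    $indexer-&gt;delete_by_term(
+        field =&gt; &#39;url&#39;,                       # hypothetical field
+        term  =&gt; &#39;http://example.com/a&#39;,      # hypothetical value
+    );
+
+    # Add the replacement in the same session.
+    $indexer-&gt;add_doc({
+        url   =&gt; &#39;http://example.com/a&#39;,
+        title =&gt; &#39;Updated title&#39;,
+    });
+
+    # Neither change is visible to search apps until this succeeds.
+    $indexer-&gt;commit;</code></pre>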
+
+<h2 id="delete_by_query-query-">delete_by_query(query)</h2>
+
+<p>Mark documents which match the supplied Query as deleted.</p>
+
+<ul>
+
+<li><p><b>query</b> - A <a href="../../Lucy/Search/Query.html">Query</a>.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::Indexer isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Lexicon.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Lexicon.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/Lexicon.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,59 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::Lexicon - Iterator for a field&#39;s terms.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $lex_reader = $seg_reader-&gt;obtain(&#39;Lucy::Index::LexiconReader&#39;);
+    my $lexicon = $lex_reader-&gt;lexicon( field =&gt; &#39;content&#39; );
+    while ( $lexicon-&gt;next ) {
+       print $lexicon-&gt;get_term . &quot;\n&quot;;
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>A Lexicon is an iterator which provides access to all the unique terms for a given field in sorted order.</p>
+
+<p>If an index consists of two documents with a &#39;content&#39; field holding &quot;three blind mice&quot; and &quot;three musketeers&quot; respectively, then iterating through the &#39;content&#39; field&#39;s lexicon would produce this list:</p>
+
+<pre><code>    blind
+    mice
+    musketeers
+    three</code></pre>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="seek-target-">seek(target)</h2>
+
+<p>Seek the Lexicon to the first iterator state which is greater than or equal to <code>target</code>. If <code>target</code> is undef, reset the iterator.</p>
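+
+<p>A sketch of using seek() to visit only the terms that start with a given prefix. It reuses <code>$lexicon</code> from the SYNOPSIS and assumes that get_term() may be called directly after seek(), since seek() leaves the iterator at a valid state when a matching term exists. With the &quot;three blind mice&quot; example below, this should visit <code>mice</code> and <code>musketeers</code>.</p>
+
+<pre><code>    # Jump to the first term greater than or equal to &#39;m&#39;.
+    $lexicon-&gt;seek(&#39;m&#39;);
+    my $term = $lexicon-&gt;get_term;
+
+    # Walk forward while terms still carry the prefix.
+    while ( defined $term and $term =~ /^m/ ) {
+        print &quot;$term\n&quot;;
+        last unless $lexicon-&gt;next;
+        $term = $lexicon-&gt;get_term;
+    }</code></pre>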
+
+<h2 id="next-">next()</h2>
+
+<p>Proceed to the next term.</p>
+
+<p>Returns: true until the iterator is exhausted, then false.</p>
+
+<h2 id="get_term-">get_term()</h2>
+
+<p>Return the current term, or undef if the iterator is not in a valid state.</p>
+
+<h2 id="reset-">reset()</h2>
+
+<p>Reset the iterator. next() must be called to proceed to the first element.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::Lexicon isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/LexiconReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/LexiconReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/LexiconReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,49 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::LexiconReader - Read Lexicon data.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $lex_reader = $seg_reader-&gt;obtain(&quot;Lucy::Index::LexiconReader&quot;);
+    my $lexicon    = $lex_reader-&gt;lexicon( field =&gt; &#39;title&#39; );</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>LexiconReader reads term dictionary information.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="lexicon-labeled-params-">lexicon( <i>[labeled params]</i> )</h2>
+
+<p>Return a new Lexicon for the given <code>field</code>. Returns undef if the field is not indexed, or if no documents contain a value for the field.</p>
+
+<ul>
+
+<li><p><b>field</b> - Field name.</p>
+
+</li>
+<li><p><b>term</b> - Pre-locate the Lexicon to this term.</p>
+
+</li>
+</ul>
+
+<h2 id="doc_freq-labeled-params-">doc_freq( <i>[labeled params]</i> )</h2>
+
+<p>Return the number of documents where the specified term is present.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::LexiconReader isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PolyReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PolyReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PolyReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,37 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::PolyReader - Multi-segment implementation of IndexReader.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $polyreader = Lucy::Index::IndexReader-&gt;open( 
+        index =&gt; &#39;/path/to/index&#39;,
+    );
+    my $doc_reader = $polyreader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+    for my $doc_id ( 1 .. $polyreader-&gt;doc_max ) {
+        my $doc = $doc_reader-&gt;fetch_doc($doc_id);
+        print &quot; $doc_id: $doc-&gt;{title}\n&quot;;
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>PolyReader conflates index data from multiple segments. For instance, if an index contains three segments with 10 documents each, PolyReader&#39;s doc_max() method will return 30.</p>
+
+<p>Some of PolyReader&#39;s <a href="../../Lucy/Index/DataReader.html">DataReader</a> components may be less efficient or complete than the single-segment implementations accessed via <a href="../../Lucy/Index/SegReader.html">SegReader</a>.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::PolyReader isa <a href="../../Lucy/Index/IndexReader.html">Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingList.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingList.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingList.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,80 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::PostingList - Term-Document pairings.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $posting_list_reader 
+        = $seg_reader-&gt;obtain(&quot;Lucy::Index::PostingListReader&quot;);
+    my $posting_list = $posting_list_reader-&gt;posting_list( 
+        field =&gt; &#39;content&#39;,
+        term  =&gt; &#39;foo&#39;,
+    );
+    while ( my $doc_id = $posting_list-&gt;next ) {
+        print &quot;Matching doc id: $doc_id\n&quot;;
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>PostingList is an iterator which supplies a list of document ids that match a given term.</p>
+
+<p>See <a href="../../Lucy/Docs/IRTheory.html">Lucy::Docs::IRTheory</a> for definitions of &quot;posting&quot; and &quot;posting list&quot;.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="next-">next()</h2>
+
+<p>Proceed to the next doc id.</p>
+
+<p>Returns: A positive doc id, or 0 once the iterator is exhausted.</p>
+
+<h2 id="get_doc_id-">get_doc_id()</h2>
+
+<p>Return the current doc id. Valid only after a successful call to next() or advance() and must not be called otherwise.</p>
+
+<h2 id="get_doc_freq-">get_doc_freq()</h2>
+
+<p>Return the number of documents that the PostingList contains. (This number will include any documents which have been marked as deleted but not yet purged.)</p>
+
+<h2 id="seek-target-">seek(target)</h2>
+
+<p>Prepare the PostingList object to iterate over the documents that match <code>target</code>.</p>
+
+<ul>
+
+<li><p><b>target</b> - The term to match. If undef, the iterator will be empty.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="advance-target-">advance(target)</h2>
+
+<p>Advance the iterator to the first doc id greater than or equal to <code>target</code>. The default implementation simply calls next() over and over, but subclasses have the option of doing something more efficient.</p>
+
+<ul>
+
+<li><p><b>target</b> - A positive doc id, which must be greater than the current doc id once the iterator has been initialized.</p>
+
+</li>
+</ul>
+
+<p>Returns: A positive doc id, or 0 once the iterator is exhausted.</p>
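+
+<p>The default strategy described above can be sketched in plain Perl. The helper <code>advance_by_next</code> is hypothetical and not part of Lucy; it works with any object whose next() method returns a positive doc id, or 0 once exhausted.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Hypothetical stand-in for the default advance(): call next() until
+    # the doc id reaches $target or the iterator is exhausted.
+    sub advance_by_next {
+        my ( $posting_list, $target ) = @_;
+        while (1) {
+            my $doc_id = $posting_list-&gt;next;
+            return $doc_id if $doc_id == 0 or $doc_id &gt;= $target;
+        }
+    }</code></pre>
+
+<p>A subclass with random access to its postings could skip ahead instead of visiting every intermediate doc id.</p>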
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::PostingList isa <a href="../../Lucy/Search/Matcher.html">Lucy::Search::Matcher</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingListReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingListReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/PostingListReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,49 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::PostingListReader - Read postings data.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $posting_list_reader 
+        = $seg_reader-&gt;obtain(&quot;Lucy::Index::PostingListReader&quot;);
+    my $posting_list = $posting_list_reader-&gt;posting_list(
+        field =&gt; &#39;title&#39;, 
+        term  =&gt; &#39;foo&#39;,
+    );</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>PostingListReaders produce <a href="../../Lucy/Index/PostingList.html">PostingList</a> objects which convey document matching information.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="posting_list-labeled-params-">posting_list( <i>[labeled params]</i> )</h2>
+
+<p>Returns a PostingList, or undef if either <code>field</code> is undef or <code>field</code> is not present in any documents.</p>
+
+<ul>
+
+<li><p><b>field</b> - A field name.</p>
+
+</li>
+<li><p><b>term</b> - If supplied, the PostingList will be pre-located to this term using seek().</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::PostingListReader isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegReader.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegReader.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegReader.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,55 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::SegReader - Single-segment IndexReader.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $polyreader = Lucy::Index::IndexReader-&gt;open(
+        index =&gt; &#39;/path/to/index&#39;,
+    );
+    my $seg_readers = $polyreader-&gt;seg_readers;
+    for my $seg_reader (@$seg_readers) {
+        my $seg_name = $seg_reader-&gt;get_seg_name;
+        my $num_docs = $seg_reader-&gt;doc_max;
+        print &quot;Segment $seg_name ($num_docs documents):\n&quot;;
+        my $doc_reader = $seg_reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+        for my $doc_id ( 1 .. $num_docs ) {
+            my $doc = $doc_reader-&gt;fetch_doc($doc_id);
+            print &quot;  $doc_id: $doc-&gt;{title}\n&quot;;
+        }
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>SegReader interprets the data within a single segment of an index.</p>
+
+<p>Generally speaking, only advanced users writing subclasses which manipulate data at the segment level need to deal with the SegReader API directly.</p>
+
+<p>Nearly all of SegReader&#39;s functionality is implemented by pluggable components spawned by <a href="../../Lucy/Plan/Architecture.html">Architecture</a>&#39;s factory methods.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="get_seg_name-">get_seg_name()</h2>
+
+<p>Return the name of the segment.</p>
+
+<h2 id="get_seg_num-">get_seg_num()</h2>
+
+<p>Return the number of the segment.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::SegReader isa <a href="../../Lucy/Index/IndexReader.html">Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+

Added: websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegWriter.html
==============================================================================
--- websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegWriter.html (added)
+++ websites/staging/lucy/trunk/content/lucy/docs/perl/Lucy/Index/SegWriter.html Wed Aug 24 00:26:06 2011
@@ -0,0 +1,61 @@
+
+<html>
+<head>
+<title></title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::SegWriter - Write one segment of an index.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>SegWriter is a conduit through which information fed to Indexer passes. It manages <a href="../../Lucy/Index/Segment.html">Segment</a> and Inverter, invokes the <a href="../../Lucy/Analysis/Analyzer.html">Analyzer</a> chain, and feeds low level <a href="../../Lucy/Index/DataWriter.html">DataWriters</a> such as PostingListWriter and DocWriter.</p>
+
+<p>The sub-components of a SegWriter are determined by <a href="../../Lucy/Plan/Architecture.html">Architecture</a>. DataWriter components which are added to the stack of writers via add_writer() have add_inverted_doc() invoked for each document supplied to SegWriter&#39;s add_doc().</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="add_doc-labeled-params-">add_doc( <i>[labeled params]</i> )</h2>
+
+<p>Add a document to the segment. Inverts <code>doc</code>, increments the Segment&#39;s internal document id, then calls add_inverted_doc(), feeding all sub-writers.</p>
+
+<h2 id="add_writer-writer-">add_writer(writer)</h2>
+
+<p>Add a DataWriter to the SegWriter&#39;s stack of writers.</p>
+
+<h2 id="register-labeled-params-">register( <i>[labeled params]</i> )</h2>
+
+<p>Register a DataWriter component with the SegWriter. (Note that registration simply makes the writer available via fetch(), so you may also want to call add_writer().)</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataWriter api which <code>component</code> implements.</p>
+
+</li>
+<li><p><b>component</b> - A DataWriter.</p>
+
+</li>
+</ul>
+
+<h2 id="fetch-api-">fetch(api)</h2>
+
+<p>Retrieve a registered component.</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataWriter api which the component implements.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::SegWriter isa <a href="../../Lucy/Index/DataWriter.html">Lucy::Index::DataWriter</a> isa <a href="../../Lucy/Object/Obj.html">Lucy::Object::Obj</a>.</p>
+
+</body>
+</html>
+