Posted to commits@lucy.apache.org by bu...@apache.org on 2016/09/28 12:07:52 UTC
svn commit: r998475 [5/26] - in /websites/staging/lucy/trunk/content: ./
docs/ docs/0.5.0/ docs/0.5.0/c/ docs/0.5.0/c/Clownfish/
docs/0.5.0/c/Clownfish/Docs/ docs/0.5.0/c/Lucy/ docs/0.5.0/c/Lucy/Analysis/
docs/0.5.0/c/Lucy/Docs/ docs/0.5.0/c/Lucy/Docs/...
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,296 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Analysis::SnowballStopFilter – C API Documentation</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Lucy::Analysis::SnowballStopFilter</h2>
+<table>
+<tr>
+<td class="label">parcel</td>
+<td><a href="../../lucy.html">Lucy</a></td>
+</tr>
+<tr>
+<td class="label">class variable</td>
+<td><code><span class="prefix">LUCY_</span>SNOWBALLSTOPFILTER</code></td>
+</tr>
+<tr>
+<td class="label">struct symbol</td>
+<td><code><span class="prefix">lucy_</span>SnowballStopFilter</code></td>
+</tr>
+<tr>
+<td class="label">class nickname</td>
+<td><code><span class="prefix">lucy_</span>SnowStop</code></td>
+</tr>
+<tr>
+<td class="label">header file</td>
+<td><code>Lucy/Analysis/SnowballStopFilter.h</code></td>
+</tr>
+</table>
+<h3>Name</h3>
+<p>Lucy::Analysis::SnowballStopFilter – Suppress a “stoplist” of common words.</p>
+<h3>Description</h3>
+<p>A “stoplist” is a collection of “stopwords”: words which are common enough to
+be of little value when determining search results. For example, so many
+documents in English contain “the”, “if”, and “maybe” that it may improve
+both performance and relevance to block them.</p>
+<p>Before filtering stopwords:</p>
+<pre><code>("i", "am", "the", "walrus")
+</code></pre>
+<p>After filtering stopwords:</p>
+<pre><code>("walrus")
+</code></pre>
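+<p>The filtering step above can be sketched in a few lines of self-contained C.
+This is only an illustration and does not use the Lucy API:
+<code>STOPLIST</code>, <code>is_stopword</code>, and <code>filter_stopwords</code>
+are hypothetical names, and the three-word stoplist is invented for the example
+rather than taken from Snowball.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stoplist for illustration only; not Snowball's English list. */
static const char *const STOPLIST[] = { "i", "am", "the" };
#define STOPLIST_LEN (sizeof(STOPLIST) / sizeof(STOPLIST[0]))

/* Return 1 if word is in the stoplist, 0 otherwise (linear scan). */
int
is_stopword(const char *word) {
    for (size_t i = 0; i < STOPLIST_LEN; i++) {
        if (strcmp(word, STOPLIST[i]) == 0) { return 1; }
    }
    return 0;
}

/* Compact the token array in place, keeping only non-stopwords in their
 * original order. Returns the number of tokens kept. */
size_t
filter_stopwords(const char **tokens, size_t count) {
    assert(tokens != NULL || count == 0);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (!is_stopword(tokens[i])) {
            tokens[kept++] = tokens[i];
        }
    }
    return kept;
}
```

+<p>Called on the four tokens shown above, <code>filter_stopwords</code> returns 1
+and leaves “walrus” in the first slot. A real filter would look words up in a
+hash, as the <code>stoplist</code> parameter suggests, instead of scanning a list.</p>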
+<p>SnowballStopFilter provides default stoplists for several languages,
+courtesy of the <a href="http://snowball.tartarus.org">Snowball project</a>, or you may
+supply your own.</p>
+<pre><code>|-----------------------|
+| ISO CODE | LANGUAGE   |
+|-----------------------|
+| da       | Danish     |
+| de       | German     |
+| en       | English    |
+| es       | Spanish    |
+| fi       | Finnish    |
+| fr       | French     |
+| hu       | Hungarian  |
+| it       | Italian    |
+| nl       | Dutch      |
+| no       | Norwegian  |
+| pt       | Portuguese |
+| sv       | Swedish    |
+| ru       | Russian    |
+|-----------------------|
+</code></pre>
+<h3>Functions</h3>
+<dl>
+<dt id="func_new">new</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>SnowballStopFilter* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_new</strong>(
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>language</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a> *<strong>stoplist</strong>
+);
+</code></pre>
+<p>Create a new SnowballStopFilter.</p>
+<dl>
+<dt>stoplist</dt>
+<dd><p>A hash with stopwords as the keys.</p>
+</dd>
+<dt>language</dt>
+<dd><p>The ISO code for a supported language.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_init">init</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>SnowballStopFilter*
+<span class="prefix">lucy_</span><strong>SnowStop_init</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>language</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a> *<strong>stoplist</strong>
+);
+</code></pre>
+<p>Initialize a SnowballStopFilter.</p>
+<dl>
+<dt>stoplist</dt>
+<dd><p>A hash with stopwords as the keys.</p>
+</dd>
+<dt>language</dt>
+<dd><p>The ISO code for a supported language.</p>
+</dd>
+</dl>
+</dd>
+</dl>
+<h3>Methods</h3>
+<dl>
+<dt id="func_Transform">Transform</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_Transform</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a> *<strong>inversion</strong>
+);
+</code></pre>
+<p>Take a single <a href="../../Lucy/Analysis/Inversion.html">Inversion</a> as input
+and return an Inversion, either the same one (presumably transformed
+in some way), or a new one.</p>
+<dl>
+<dt>inversion</dt>
+<dd><p>An inversion.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Equals">Equals</dt>
+<dd>
+<pre><code>bool
+<span class="prefix">lucy_</span><strong>SnowStop_Equals</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong>
+);
+</code></pre>
+<p>Indicate whether two objects are the same. By default, compares the
+memory address.</p>
+<dl>
+<dt>other</dt>
+<dd><p>Another Obj.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Dump">Dump</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_Dump</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>
+);
+</code></pre>
+<p>Dump the analyzer as a hash.</p>
+<p>Subclasses should call <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Dump">Dump()</a> on the superclass. The returned
+object is a hash which should be populated with parameters of
+the analyzer.</p>
+<p><strong>Returns:</strong> A hash containing a description of the analyzer.</p>
+</dd>
+<dt id="func_Load">Load</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_Load</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>dump</strong>
+);
+</code></pre>
+<p>Reconstruct an analyzer from a dump.</p>
+<p>Subclasses should first call <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Load">Load()</a> on the superclass. The
+returned object is an analyzer which should be reconstructed by
+setting the dumped parameters from the hash contained in <code>dump</code>.</p>
+<p>Note that the invocant analyzer is unused.</p>
+<dl>
+<dt>dump</dt>
+<dd><p>A hash.</p>
+</dd>
+</dl>
+<p><strong>Returns:</strong> An analyzer.</p>
+</dd>
+</dl>
+<h4>Methods inherited from Lucy::Analysis::Analyzer</h4>
+<dl>
+<dt id="func_Transform_Text">Transform_Text</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_Transform_Text</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong>
+);
+</code></pre>
+<p>Kick off an analysis chain, creating an Inversion from string input.
+The default implementation simply creates an initial Inversion with a
+single Token, then calls <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Transform">Transform()</a>, but occasionally subclasses will
+provide an optimized implementation which minimizes string copies.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A string.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Split">Split</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>SnowStop_Split</strong>(
+ <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong>
+);
+</code></pre>
+<p>Analyze text and return an array of token texts.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A string.</p>
+</dd>
+</dl>
+</dd>
+</dl>
+<h3>Inheritance</h3>
+<p>Lucy::Analysis::SnowballStopFilter is a <a href="../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,250 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Analysis::StandardTokenizer – C API Documentation</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Lucy::Analysis::StandardTokenizer</h2>
+<table>
+<tr>
+<td class="label">parcel</td>
+<td><a href="../../lucy.html">Lucy</a></td>
+</tr>
+<tr>
+<td class="label">class variable</td>
+<td><code><span class="prefix">LUCY_</span>STANDARDTOKENIZER</code></td>
+</tr>
+<tr>
+<td class="label">struct symbol</td>
+<td><code><span class="prefix">lucy_</span>StandardTokenizer</code></td>
+</tr>
+<tr>
+<td class="label">class nickname</td>
+<td><code><span class="prefix">lucy_</span>StandardTokenizer</code></td>
+</tr>
+<tr>
+<td class="label">header file</td>
+<td><code>Lucy/Analysis/StandardTokenizer.h</code></td>
+</tr>
+</table>
+<h3>Name</h3>
+<p>Lucy::Analysis::StandardTokenizer – Split a string into tokens.</p>
+<h3>Description</h3>
+<p>Generically, “tokenizing” is a process of breaking up a string into an
+array of “tokens”. For instance, the string “three blind mice” might be
+tokenized into “three”, “blind”, “mice”.</p>
+<p>Lucy::Analysis::StandardTokenizer breaks up the text at the word
+boundaries defined in Unicode Standard Annex #29. It then returns those
+words that contain alphabetic or numeric characters.</p>
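+<p>As a rough illustration of tokenizing, the sketch below splits ASCII text
+into runs of letters and digits. It is a deliberate simplification, not the
+Unicode Standard Annex #29 word-boundary algorithm and not the Lucy API; the
+<code>Span</code> type and <code>split_alnum</code> function are hypothetical
+names introduced for this example.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One token as a byte range [start, end) within the input string. */
typedef struct {
    size_t start;
    size_t end;
} Span;

/* Split ASCII text into runs of letters and digits.
 * Writes at most max_spans spans and returns how many were written. */
size_t
split_alnum(const char *text, Span *spans, size_t max_spans) {
    assert(text != NULL && spans != NULL);
    size_t count = 0;
    size_t i = 0;
    while (text[i] != '\0' && count < max_spans) {
        /* Skip separators such as spaces and punctuation. */
        while (text[i] != '\0' && !isalnum((unsigned char)text[i])) { i++; }
        if (text[i] == '\0') { break; }
        /* Consume one run of alphanumeric characters. */
        size_t start = i;
        while (text[i] != '\0' && isalnum((unsigned char)text[i])) { i++; }
        spans[count].start = start;
        spans[count].end = i;
        count++;
    }
    return count;
}
```

+<p>For “three blind mice” this yields three spans: [0, 5), [6, 11), and [12, 16).</p>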
+<h3>Functions</h3>
+<dl>
+<dt id="func_new">new</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>StandardTokenizer* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_new</strong>(void);
+</code></pre>
+<p>Constructor. Takes no arguments.</p>
+</dd>
+<dt id="func_init">init</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>StandardTokenizer*
+<span class="prefix">lucy_</span><strong>StandardTokenizer_init</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>
+);
+</code></pre>
+<p>Initialize a StandardTokenizer.</p>
+</dd>
+</dl>
+<h3>Methods</h3>
+<dl>
+<dt id="func_Transform">Transform</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Transform</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>,
+ <span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a> *<strong>inversion</strong>
+);
+</code></pre>
+<p>Take a single <a href="../../Lucy/Analysis/Inversion.html">Inversion</a> as input
+and return an Inversion, either the same one (presumably transformed
+in some way), or a new one.</p>
+<dl>
+<dt>inversion</dt>
+<dd><p>An inversion.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Transform_Text">Transform_Text</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Transform_Text</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong>
+);
+</code></pre>
+<p>Kick off an analysis chain, creating an Inversion from string input.
+The default implementation simply creates an initial Inversion with a
+single Token, then calls <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Transform">Transform()</a>, but occasionally subclasses will
+provide an optimized implementation which minimizes string copies.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A string.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Equals">Equals</dt>
+<dd>
+<pre><code>bool
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Equals</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong>
+);
+</code></pre>
+<p>Indicate whether two objects are the same. By default, compares the
+memory address.</p>
+<dl>
+<dt>other</dt>
+<dd><p>Another Obj.</p>
+</dd>
+</dl>
+</dd>
+</dl>
+<h4>Methods inherited from Lucy::Analysis::Analyzer</h4>
+<dl>
+<dt id="func_Split">Split</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Split</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong>
+);
+</code></pre>
+<p>Analyze text and return an array of token texts.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A string.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_Dump">Dump</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Dump</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>
+);
+</code></pre>
+<p>Dump the analyzer as a hash.</p>
+<p>Subclasses should call <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Dump">Dump()</a> on the superclass. The returned
+object is a hash which should be populated with parameters of
+the analyzer.</p>
+<p><strong>Returns:</strong> A hash containing a description of the analyzer.</p>
+</dd>
+<dt id="func_Load">Load</dt>
+<dd>
+<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>StandardTokenizer_Load</strong>(
+ <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>,
+ <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>dump</strong>
+);
+</code></pre>
+<p>Reconstruct an analyzer from a dump.</p>
+<p>Subclasses should first call <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Load">Load()</a> on the superclass. The
+returned object is an analyzer which should be reconstructed by
+setting the dumped parameters from the hash contained in <code>dump</code>.</p>
+<p>Note that the invocant analyzer is unused.</p>
+<dl>
+<dt>dump</dt>
+<dd><p>A hash.</p>
+</dd>
+</dl>
+<p><strong>Returns:</strong> An analyzer.</p>
+</dd>
+</dl>
+<h3>Inheritance</h3>
+<p>Lucy::Analysis::StandardTokenizer is a <a href="../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,282 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Analysis::Token – C API Documentation</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Lucy::Analysis::Token</h2>
+<table>
+<tr>
+<td class="label">parcel</td>
+<td><a href="../../lucy.html">Lucy</a></td>
+</tr>
+<tr>
+<td class="label">class variable</td>
+<td><code><span class="prefix">LUCY_</span>TOKEN</code></td>
+</tr>
+<tr>
+<td class="label">struct symbol</td>
+<td><code><span class="prefix">lucy_</span>Token</code></td>
+</tr>
+<tr>
+<td class="label">class nickname</td>
+<td><code><span class="prefix">lucy_</span>Token</code></td>
+</tr>
+<tr>
+<td class="label">header file</td>
+<td><code>Lucy/Analysis/Token.h</code></td>
+</tr>
+</table>
+<h3>Name</h3>
+<p>Lucy::Analysis::Token – Unit of text.</p>
+<h3>Description</h3>
+<p>Token is the fundamental unit used by Apache Lucy’s Analyzer subclasses.
+Each Token has 5 attributes: <code>text</code>, <code>start_offset</code>,
+<code>end_offset</code>, <code>boost</code>, and <code>pos_inc</code>.</p>
+<p>The <code>text</code> attribute is a Unicode string encoded as UTF-8.</p>
+<p><code>start_offset</code> is the start point of the token text, measured in
+Unicode code points from the top of the stored field;
+<code>end_offset</code> delimits the corresponding closing boundary.
+<code>start_offset</code> and <code>end_offset</code> locate the Token
+within a larger context, even if the Token’s text attribute gets modified
+– by stemming, for instance. The Token for “beating” in the text “beating
+a dead horse” begins life with a start_offset of 0 and an end_offset of 7;
+after stemming, the text is “beat”, but the start_offset is still 0 and the
+end_offset is still 7. This allows “beating” to be highlighted correctly
+after a search matches “beat”.</p>
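+<p>The “beating” example can be sketched in self-contained C. This is only an
+illustration under stated assumptions: <code>DemoToken</code>,
+<code>demo_set_text</code>, and <code>demo_stem</code> are hypothetical names,
+the struct is not the layout of <code>lucy_Token</code>, and the toy stemmer
+merely strips a trailing “ing”.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a token; not the layout of lucy_Token. */
typedef struct {
    char     text[32];
    uint32_t start_offset;
    uint32_t end_offset;
} DemoToken;

/* Replace the token text while leaving both offsets untouched. */
void
demo_set_text(DemoToken *token, const char *text) {
    assert(token != NULL && text != NULL);
    assert(strlen(text) < sizeof(token->text));
    strcpy(token->text, text);
}

/* Toy "stemmer": strips a trailing "ing" and changes nothing else.
 * Real stemmers are far more involved. */
void
demo_stem(DemoToken *token) {
    assert(token != NULL);
    size_t len = strlen(token->text);
    if (len > 3 && strcmp(token->text + len - 3, "ing") == 0) {
        token->text[len - 3] = '\0';
    }
}
```

+<p>A token created with offsets 0 and 7 and the text “beating” ends up with the
+text “beat” after <code>demo_stem</code>, while both offsets keep their original
+values, which is what lets a highlighter mark the full original word.</p>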
+<p><code>boost</code> is a per-token weight. Use this when you want to assign
+more or less importance to a particular token, as you might for emboldened
+text within an HTML document, for example. (Note: The field this token
+belongs to must be spec’d to use a posting of type RichPosting.)</p>
+<p><code>pos_inc</code> is the POSition INCrement, measured in Tokens. This
+attribute, which defaults to 1, is an advanced tool for manipulating
+phrase matching. Ordinarily, Tokens are assigned consecutive position
+numbers: 0, 1, and 2 for <code>"three blind mice"</code>. However, if you
+set the position increment for “blind” to, say, 1000, then the three tokens
+will end up assigned to positions 0, 1, and 1001 – and will no longer
+produce a phrase match for the query <code>"three blind mice"</code>.</p>
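+<p>The position arithmetic can be sketched as follows. This is an illustration,
+not Lucy's implementation: <code>assign_positions</code> and
+<code>is_consecutive</code> are hypothetical names, and the rule that each token
+takes the running position before its own increment is added is an assumption
+chosen to reproduce the 0, 1, 1001 figures in the text.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assign positions: each token takes the running position, which is then
 * advanced by that token's own increment. This ordering is an assumption
 * chosen to match the 0, 1, 1001 example in the text. */
void
assign_positions(const int32_t *pos_incs, int32_t *positions, size_t count) {
    assert(pos_incs != NULL && positions != NULL);
    int32_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        positions[i] = pos;
        pos += pos_incs[i];
    }
}

/* Return 1 if every position is exactly one more than the previous one,
 * which is the condition a simple phrase match relies on; 0 otherwise. */
int
is_consecutive(const int32_t *positions, size_t count) {
    assert(positions != NULL || count == 0);
    for (size_t i = 1; i < count; i++) {
        if (positions[i] != positions[i - 1] + 1) { return 0; }
    }
    return 1;
}
```

+<p>With increments of 1, 1, 1 the positions are 0, 1, 2 and the check succeeds;
+with 1, 1000, 1 they become 0, 1, 1001 and the check fails.</p>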
+<h3>Functions</h3>
+<dl>
+<dt id="func_new">new</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>Token* <span class="comment">// incremented</span>
+<span class="prefix">lucy_</span><strong>Token_new</strong>(
+ char *<strong>text</strong>,
+ size_t <strong>len</strong>,
+ uint32_t <strong>start_offset</strong>,
+ uint32_t <strong>end_offset</strong>,
+ float <strong>boost</strong>,
+ int32_t <strong>pos_inc</strong>
+);
+</code></pre>
+<p>Create a new Token.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A UTF-8 string.</p>
+</dd>
+<dt>len</dt>
+<dd><p>Size of the string in bytes.</p>
+</dd>
+<dt>start_offset</dt>
+<dd><p>Start offset into the original document in Unicode
+code points.</p>
+</dd>
+<dt>end_offset</dt>
+<dd><p>End offset into the original document in Unicode
+code points.</p>
+</dd>
+<dt>boost</dt>
+<dd><p>Per-token weight.</p>
+</dd>
+<dt>pos_inc</dt>
+<dd><p>Position increment for phrase matching.</p>
+</dd>
+</dl>
+</dd>
+<dt id="func_init">init</dt>
+<dd>
+<pre><code><span class="prefix">lucy_</span>Token*
+<span class="prefix">lucy_</span><strong>Token_init</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>,
+ char *<strong>text</strong>,
+ size_t <strong>len</strong>,
+ uint32_t <strong>start_offset</strong>,
+ uint32_t <strong>end_offset</strong>,
+ float <strong>boost</strong>,
+ int32_t <strong>pos_inc</strong>
+);
+</code></pre>
+<p>Initialize a Token.</p>
+<dl>
+<dt>text</dt>
+<dd><p>A UTF-8 string.</p>
+</dd>
+<dt>len</dt>
+<dd><p>Size of the string in bytes.</p>
+</dd>
+<dt>start_offset</dt>
+<dd><p>Start offset into the original document in Unicode
+code points.</p>
+</dd>
+<dt>end_offset</dt>
+<dd><p>End offset into the original document in Unicode
+code points.</p>
+</dd>
+<dt>boost</dt>
+<dd><p>Per-token weight.</p>
+</dd>
+<dt>pos_inc</dt>
+<dd><p>Position increment for phrase matching.</p>
+</dd>
+</dl>
+</dd>
+</dl>
+<h3>Methods</h3>
+<dl>
+<dt id="func_Get_Start_Offset">Get_Start_Offset</dt>
+<dd>
+<pre><code>uint32_t
+<span class="prefix">lucy_</span><strong>Token_Get_Start_Offset</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Get_End_Offset">Get_End_Offset</dt>
+<dd>
+<pre><code>uint32_t
+<span class="prefix">lucy_</span><strong>Token_Get_End_Offset</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Get_Boost">Get_Boost</dt>
+<dd>
+<pre><code>float
+<span class="prefix">lucy_</span><strong>Token_Get_Boost</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Get_Pos_Inc">Get_Pos_Inc</dt>
+<dd>
+<pre><code>int32_t
+<span class="prefix">lucy_</span><strong>Token_Get_Pos_Inc</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Get_Text">Get_Text</dt>
+<dd>
+<pre><code>char*
+<span class="prefix">lucy_</span><strong>Token_Get_Text</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Get_Len">Get_Len</dt>
+<dd>
+<pre><code>size_t
+<span class="prefix">lucy_</span><strong>Token_Get_Len</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>
+);
+</code></pre>
+</dd>
+<dt id="func_Set_Text">Set_Text</dt>
+<dd>
+<pre><code>void
+<span class="prefix">lucy_</span><strong>Token_Set_Text</strong>(
+ <span class="prefix">lucy_</span>Token *<strong>self</strong>,
+ char *<strong>text</strong>,
+ size_t <strong>len</strong>
+);
+</code></pre>
+</dd>
+</dl>
+<h3>Inheritance</h3>
+<p>Lucy::Analysis::Token is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,120 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::Cookbook</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Apache Lucy recipes</h2>
+<p>The Cookbook provides thematic documentation covering some of Apache Lucy’s
+more sophisticated features. For a step-by-step introduction to Lucy,
+see <a href="../../Lucy/Docs/Tutorial.html">Tutorial</a>.</p>
+<h3>Chapters</h3>
+<ul>
+<li>
+<p><a href="../../Lucy/Docs/Cookbook/FastUpdates.html">FastUpdates</a> - While index updates are fast on
+average, worst-case update performance may be significantly slower. To make
+index updates consistently quick, we must manually intervene to control the
+process of index segment consolidation.</p>
+</li>
+<li>
+<p><a href="../../Lucy/Docs/Cookbook/CustomQuery.html">CustomQuery</a> - Explore Lucy’s support for
+custom query types by creating a “PrefixQuery” class to handle trailing
+wildcards.</p>
+</li>
+<li>
+<p><a href="../../Lucy/Docs/Cookbook/CustomQueryParser.html">CustomQueryParser</a> - Define your own custom
+search query syntax using <a href="../../Lucy/Search/QueryParser.html">QueryParser</a> and
+Parse::RecDescent.</p>
+</li>
+</ul>
+<h3>Materials</h3>
+<p>Some of the recipes in the Cookbook reference the completed
+<a href="../../Lucy/Docs/Tutorial.html">Tutorial</a> application. These materials can be
+found in the <code>sample</code> directory at the root of the Lucy distribution:</p>
+<pre><code>Code example for C is missing</code></pre>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,190 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::Cookbook::CustomQuery</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Sample subclass of Query</h2>
+<p>Explore Apache Lucy’s support for custom query types by creating a
+“PrefixQuery” class to handle trailing wildcards.</p>
+<pre><code>Code example for C is missing</code></pre>
+<h3>Query, Compiler, and Matcher</h3>
+<p>To add support for a new query type, we need three classes: a Query, a
+Compiler, and a Matcher.</p>
+<ul>
+<li>
+<p>PrefixQuery - a subclass of <a href="../../../Lucy/Search/Query.html">Query</a>, and the only class
+that client code will deal with directly.</p>
+</li>
+<li>
+<p>PrefixCompiler - a subclass of <a href="../../../Lucy/Search/Compiler.html">Compiler</a>, whose primary
+role is to compile a PrefixQuery to a PrefixMatcher.</p>
+</li>
+<li>
+<p>PrefixMatcher - a subclass of <a href="../../../Lucy/Search/Matcher.html">Matcher</a>, which does the
+heavy lifting: it applies the query to individual documents and assigns a
+score to each match.</p>
+</li>
+</ul>
+<p>The PrefixQuery class on its own isn’t enough because a Query object’s role is
+limited to expressing an abstract specification for the search. A Query is
+basically nothing but metadata; execution is left to the Query’s companion
+Compiler and Matcher.</p>
+<p>Here’s a simplified sketch illustrating how a Searcher’s hits() method ties
+together the three classes.</p>
+<pre><code>Code example for C is missing</code></pre>
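+<p>As a rough illustration of the hand-off described above, the following plain C sketch chains a query, a compiler, and a matcher. It does not use the Lucy API: every type and function name here (<code>SketchQuery</code>, <code>sketch_make_compiler</code>, and so on) is hypothetical, and each "document" is reduced to a single term.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are Lucy types. */
typedef struct { const char *field; const char *prefix; } SketchQuery;
typedef struct { const SketchQuery *query; } SketchCompiler;
typedef struct {
    const char *const *terms;  /* terms[i] is the only term of doc i + 1 */
    size_t num_docs;
    const char *prefix;
    size_t pos;                /* number of docs already examined */
} SketchMatcher;

/* Query -> Compiler: the query is only a specification. */
SketchCompiler
sketch_make_compiler(const SketchQuery *query) {
    assert(query != NULL && query->prefix != NULL);
    SketchCompiler compiler = { query };
    return compiler;
}

/* Compiler -> Matcher: bind the specification to concrete documents. */
SketchMatcher
sketch_make_matcher(const SketchCompiler *compiler,
                    const char *const *terms, size_t num_docs) {
    SketchMatcher matcher = { terms, num_docs, compiler->query->prefix, 0 };
    return matcher;
}

/* Advance to the next matching doc; return its id, or 0 when exhausted. */
size_t
sketch_next(SketchMatcher *matcher) {
    size_t prefix_len = strlen(matcher->prefix);
    while (matcher->pos < matcher->num_docs) {
        size_t doc_id = ++matcher->pos;        /* doc ids start at 1 */
        if (strncmp(matcher->terms[doc_id - 1], matcher->prefix,
                    prefix_len) == 0) {
            return doc_id;
        }
    }
    return 0;
}

/* Searcher-like driver: run the whole chain and count the hits. */
size_t
sketch_count_hits(const SketchQuery *query,
                  const char *const *terms, size_t num_docs) {
    SketchCompiler compiler = sketch_make_compiler(query);
    SketchMatcher  matcher  = sketch_make_matcher(&compiler, terms, num_docs);
    size_t hits = 0;
    while (sketch_next(&matcher) != 0) { hits++; }
    return hits;
}
```

+<p>The point of the split is that <code>SketchQuery</code> carries no execution state at all; only the matcher knows about documents.</p>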
+<h4>PrefixQuery</h4>
+<p>Our PrefixQuery class will have two attributes: a query string and a field
+name.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>PrefixQuery’s constructor collects and validates the attributes.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>Since this is an inside-out class, we’ll need a destructor:</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>The equals() method determines whether two Queries are logically equivalent:</p>
+<pre><code>Code example for C is missing</code></pre>
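+<p>A minimal sketch of such an equivalence test in plain C, assuming a query is fully described by its two attributes. <code>SketchPrefixQuery</code> and <code>sketch_prefix_query_equals</code> are hypothetical names, not part of Lucy.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a query with two attributes. */
typedef struct {
    const char *field;
    const char *query_string;
} SketchPrefixQuery;

/* Two queries are logically equivalent when both attributes match. */
bool
sketch_prefix_query_equals(const SketchPrefixQuery *a,
                           const SketchPrefixQuery *b) {
    assert(a != NULL && b != NULL);
    if (a == b) { return true; }               /* same object */
    return strcmp(a->field, b->field) == 0
           && strcmp(a->query_string, b->query_string) == 0;
}
```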
+<p>The last thing we’ll need is a make_compiler() factory method which kicks out
+a subclass of <a href="../../../Lucy/Search/Compiler.html">Compiler</a>.</p>
+<pre><code>Code example for C is missing</code></pre>
+<h4>PrefixCompiler</h4>
+<p>PrefixQuery’s make_compiler() method will be called internally at search-time
+by objects which subclass <a href="../../../Lucy/Search/Searcher.html">Searcher</a> – such as
+<a href="../../../Lucy/Search/IndexSearcher.html">IndexSearchers</a>.</p>
+<p>A Searcher is associated with a particular collection of documents. These
+documents may all reside in one index, as with IndexSearcher, or they may be
+spread out across multiple indexes on one or more machines, as with
+LucyX::Remote::ClusterSearcher.</p>
+<p>Searcher objects have access to certain statistical information about the
+collections they represent; for instance, a Searcher can tell you how many
+documents are in the collection…</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>… or how many documents a specific term appears in:</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>Such information can be used by sophisticated Compiler implementations to
+assign more or less heft to individual queries or sub-queries. However, we’re
+not going to bother with weighting for this demo; we’ll just assign a fixed
+score of 1.0 to each matching document.</p>
+<p>We don’t need to write a constructor, as it will suffice to inherit new() from
+Lucy::Search::Compiler. The only method we need to implement for
+PrefixCompiler is make_matcher().</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>PrefixCompiler gets access to a <a href="../../../Lucy/Index/SegReader.html">SegReader</a>
+object when make_matcher() gets called. From the SegReader and its
+sub-components <a href="../../../Lucy/Index/LexiconReader.html">LexiconReader</a> and
+<a href="../../../Lucy/Index/PostingListReader.html">PostingListReader</a>, we acquire a
+<a href="../../../Lucy/Index/Lexicon.html">Lexicon</a>, scan through the Lexicon’s unique
+terms, and acquire a <a href="../../../Lucy/Index/PostingList.html">PostingList</a> for each
+term that matches our prefix.</p>
+<p>Each of these PostingList objects represents a set of documents which match
+the query.</p>
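+<p>The scan described above might look roughly like this in plain C. The sketch models a lexicon as an array of terms with attached doc id lists; <code>SketchPosting</code> and <code>sketch_collect_prefix_docs</code> are hypothetical and do not correspond to Lucy's Lexicon or PostingList interfaces.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one lexicon term plus its posting list. */
typedef struct {
    const char *term;
    const int  *doc_ids;
    size_t      num_docs;
} SketchPosting;

/* Copy the doc ids of every term that starts with `prefix` into `out`.
 * Returns the number of ids written; never writes more than `cap`. */
size_t
sketch_collect_prefix_docs(const SketchPosting *lexicon, size_t num_terms,
                           const char *prefix, int *out, size_t cap) {
    assert(lexicon != NULL && prefix != NULL && out != NULL);
    size_t prefix_len = strlen(prefix);
    size_t count = 0;
    for (size_t i = 0; i < num_terms; i++) {
        if (strncmp(lexicon[i].term, prefix, prefix_len) != 0) {
            continue;                          /* term lacks the prefix */
        }
        for (size_t j = 0; j < lexicon[i].num_docs && count < cap; j++) {
            out[count++] = lexicon[i].doc_ids[j];
        }
    }
    return count;
}
```

+<p>Note that the collected ids are neither sorted nor unique: a document containing both "foo" and "food" shows up twice, so a consumer has to sort and deduplicate them.</p>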
+<h4>PrefixMatcher</h4>
+<p>The Matcher subclass is the most involved.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>The doc ids must be in order, or some will be ignored; hence the <code>sort</code>
+above.</p>
+<p>In addition to the constructor and destructor, there are three methods that
+must be overridden.</p>
+<p>next() advances the Matcher to the next valid matching doc.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>get_doc_id() returns the current document id, or 0 if the Matcher is
+exhausted. (<a href="../../../Lucy/Docs/DocIDs.html">Document numbers</a> start at 1, so 0 is
+a sentinel.)</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>score() conveys the relevance score of the current match. We’ll just return a
+fixed score of 1.0:</p>
+<pre><code>Code example for C is missing</code></pre>
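+<p>The three methods can be sketched in plain C over a sorted array of doc ids. This is an illustration under stated assumptions rather than Lucy's Matcher interface: <code>SketchPrefixMatcher</code> and the <code>sketch_matcher_*</code> functions are hypothetical, and the matcher borrows the caller's array instead of owning it.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical matcher state; not a Lucy type. */
typedef struct {
    int   *doc_ids;   /* sorted ascending, duplicates removed */
    size_t num_ids;
    size_t tick;      /* how many times the matcher has advanced */
} SketchPrefixMatcher;

static int
sketch_compare_ints(const void *a, const void *b) {
    int x = *(const int*)a;
    int y = *(const int*)b;
    return (x > y) - (x < y);
}

/* Sort `doc_ids` in place, drop duplicates, and reset the matcher. */
void
sketch_matcher_init(SketchPrefixMatcher *m, int *doc_ids, size_t num_ids) {
    assert(m != NULL && (doc_ids != NULL || num_ids == 0));
    if (num_ids > 1) {
        qsort(doc_ids, num_ids, sizeof(int), sketch_compare_ints);
    }
    size_t unique = 0;
    for (size_t i = 0; i < num_ids; i++) {
        if (unique == 0 || doc_ids[unique - 1] != doc_ids[i]) {
            doc_ids[unique++] = doc_ids[i];
        }
    }
    m->doc_ids = doc_ids;
    m->num_ids = unique;
    m->tick    = 0;
}

/* Current doc id; 0 before the first advance and after exhaustion. */
int
sketch_matcher_get_doc_id(const SketchPrefixMatcher *m) {
    return (m->tick >= 1 && m->tick <= m->num_ids)
           ? m->doc_ids[m->tick - 1]
           : 0;
}

/* Advance to the next doc id; returns 0 once the ids run out. */
int
sketch_matcher_next(SketchPrefixMatcher *m) {
    if (m->tick <= m->num_ids) { m->tick++; }
    return sketch_matcher_get_doc_id(m);
}

/* Every match gets the same fixed score. */
float
sketch_matcher_score(const SketchPrefixMatcher *m) {
    (void)m;
    return 1.0f;
}
```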
+<h3>Usage</h3>
+<p>To get a basic feel for PrefixQuery, insert the FlatQueryParser module
+described in <a href="../../../Lucy/Docs/Cookbook/CustomQueryParser.html">CustomQueryParser</a> (which supports
+PrefixQuery) into the search.cgi sample app.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>If you’re planning on using PrefixQuery in earnest, though, you may want to
+change up analyzers to avoid stemming, because stemming – another approach to
+prefix conflation – is not perfectly compatible with prefix searches.</p>
+<pre><code>Code example for C is missing</code></pre>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,165 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::Cookbook::CustomQueryParser</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Sample subclass of QueryParser</h2>
+<p>Implement a custom search query language using a subclass of
+<a href="../../../Lucy/Search/QueryParser.html">QueryParser</a>.</p>
+<h3>The language</h3>
+<p>At first, our query language will support only simple term queries and phrases
+delimited by double quotes. For simplicity’s sake, it will not support
+parenthetical groupings, boolean operators, or prepended plus/minus. The
+results for all subqueries will be unioned together – i.e. joined using an OR
+– which is usually the best approach for small-to-medium-sized document
+collections.</p>
+<p>Later, we’ll add support for trailing wildcards.</p>
+<h3>Single-field parser</h3>
+<p>Our initial parser implementation will generate queries against a single fixed
+field, “content”, and it will analyze text using a fixed choice of English
+EasyAnalyzer. We won’t subclass Lucy::Search::QueryParser just yet.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>Some private helper subs for creating TermQuery and PhraseQuery objects will
+help keep the size of our main parse() subroutine down:</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>Our private _tokenize() method treats double-quote delimited material as a
+single token and splits on whitespace everywhere else.</p>
+<pre><code>Code example for C is missing</code></pre>
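+<p>A minimal sketch of that tokenizing rule in plain C, assuming ASCII whitespace and no escape sequences inside quotes. The function name <code>sketch_next_token</code> is hypothetical; it reports tokens as pointer and length pairs into the original string, and a quoted token keeps its quotes.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Find the next token at or after *cursor.
 * On success, store the token start in *start and its length in *len
 * (quotes included for phrases), advance *cursor past the token, and
 * return 1.  Return 0 when only whitespace remains. */
int
sketch_next_token(const char **cursor, const char **start, size_t *len) {
    assert(cursor != NULL && *cursor != NULL && start != NULL && len != NULL);
    const char *p = *cursor;

    while (*p != '\0' && isspace((unsigned char)*p)) { p++; }
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }

    *start = p;
    if (*p == '"') {
        p++;                                     /* opening quote */
        while (*p != '\0' && *p != '"') { p++; }
        if (*p == '"') { p++; }                  /* closing quote, if any */
    }
    else {
        while (*p != '\0' && !isspace((unsigned char)*p)) { p++; }
    }

    *len    = (size_t)(p - *start);
    *cursor = p;
    return 1;
}
```

+<p>An unterminated quote simply runs to the end of the input, so malformed queries still yield a token instead of an error.</p>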
+<p>The main parsing routine creates an array of tokens by calling _tokenize(),
+runs the tokens through the EasyAnalyzer, creates TermQuery or
+PhraseQuery objects according to how many tokens emerge from the
+EasyAnalyzer’s split() method, and adds each of the sub-queries to the primary
+ORQuery.</p>
+<pre><code>Code example for C is missing</code></pre>
+<h3>Multi-field parser</h3>
+<p>Most often, the end user will want their search query to match not only a
+single ‘content’ field, but also ‘title’ and so on. To make that happen, we
+have to turn queries such as this…</p>
+<pre><code>foo AND NOT bar
+</code></pre>
+<p>… into the logical equivalent of this:</p>
+<pre><code>(title:foo OR content:foo) AND NOT (title:bar OR content:bar)
+</code></pre>
+<p>Rather than continue with our own from-scratch parser class and write the
+routines to accomplish that expansion, we’re now going to subclass Lucy::Search::QueryParser
+and take advantage of some of its existing methods.</p>
+<p>Our first parser implementation had the “content” field name and the choice of
+English EasyAnalyzer hard-coded for simplicity, but we don’t need to do that
+once we subclass Lucy::Search::QueryParser. QueryParser’s constructor –
+which we will inherit, allowing us to eliminate our own constructor –
+requires a Schema which conveys field
+and Analyzer information, so we can just defer to that.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>We’re also going to jettison our _make_term_query() and _make_phrase_query()
+helper subs and chop our parse() subroutine way down. Our revised parse()
+routine will generate Lucy::Search::LeafQuery objects instead of TermQueries
+and PhraseQueries:</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>The magic happens in QueryParser’s expand() method, which walks the ORQuery
+object we supply to it looking for LeafQuery objects, and calls expand_leaf()
+for each one it finds. expand_leaf() performs field-specific analysis,
+decides whether each query should be a TermQuery or a PhraseQuery, and if
+multiple fields are required, creates an ORQuery which expands e.g. <code>foo</code>
+into <code>(title:foo OR content:foo)</code>.</p>
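+<p>To make the expansion concrete, here is a small plain C sketch that renders the expanded form as a string. It only mimics the shape of the result; <code>sketch_expand_leaf</code> is a hypothetical helper and performs none of the field-specific analysis that QueryParser does.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Render `term` expanded across `fields` into `out`.
 * One field yields "field:term"; several yield "(f1:term OR f2:term)".
 * Returns 0 on success, -1 if `out` (of size `cap`) is too small. */
int
sketch_expand_leaf(const char *term, const char *const *fields,
                   size_t num_fields, char *out, size_t cap) {
    assert(term != NULL && fields != NULL && out != NULL && num_fields > 0);
    size_t used = 0;

    for (size_t i = 0; i < num_fields; i++) {
        const char *lead = (i > 0) ? " OR " : (num_fields > 1 ? "(" : "");
        int n = snprintf(out + used, cap - used, "%s%s:%s",
                         lead, fields[i], term);
        if (n < 0 || (size_t)n >= cap - used) { return -1; }
        used += (size_t)n;
    }

    if (num_fields > 1) {
        if (cap - used < 2) { return -1; }
        memcpy(out + used, ")", 2);              /* ')' plus terminator */
    }
    return 0;
}
```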
+<h3>Extending the query language</h3>
+<p>To add support for trailing wildcards to our query language, we need to
+override expand_leaf() to accommodate PrefixQuery, while deferring to the
+parent class implementation on TermQuery and PhraseQuery.</p>
+<pre><code>Code example for C is missing</code></pre>
+<p>Ordinarily, those asterisks would have been stripped when running tokens
+through the EasyAnalyzer – query strings containing “foo*” would produce
+TermQueries for the term “foo”. Our override intercepts tokens with trailing
+asterisks and processes them as PrefixQueries before <code>SUPER::expand_leaf</code> can
+discard them, so that a search for “foo*” can match “food”, “foosball”, and so
+on.</p>
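+<p>The interception step boils down to a check for a trailing asterisk followed by a prefix comparison. A plain C sketch, with hypothetical helper names and the assumption that a lone "*" is not treated as a wildcard:</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* If `token` ends in '*' and has at least one character before it,
 * return the length of the prefix (the token minus the asterisk).
 * Otherwise return 0, meaning "not a prefix query". */
size_t
sketch_wildcard_prefix_len(const char *token) {
    assert(token != NULL);
    size_t len = strlen(token);
    return (len >= 2 && token[len - 1] == '*') ? len - 1 : 0;
}

/* True if `term` begins with the first `prefix_len` bytes of `prefix`. */
bool
sketch_term_has_prefix(const char *term, const char *prefix,
                       size_t prefix_len) {
    return strncmp(term, prefix, prefix_len) == 0;
}
```

+<p>Tokens for which the first function returns 0 would be handed to the parent implementation unchanged.</p>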
+<h3>Usage</h3>
+<p>Insert our custom parser into the search.cgi sample app to get a feel for how
+it behaves:</p>
+<pre><code>Code example for C is missing</code></pre>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,251 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::Cookbook::FastUpdates</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+ <div #id="lucy-top_nav_box" class="grid_8">
+ <div id="lucy-top_nav_bar" class="container_8">
+ <ul>
+ <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li>
+ <li><a href="http://www.apache.org/licenses/" title="License">License</a></li>
+ <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li>
+ <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li>
+ <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li>
+ </ul>
+ </div> <!-- lucy-top_nav_bar -->
+ <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p>
+ <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get">
+ <input value="*.apache.org" name="sitesearch" type="hidden"/>
+ <input type="text" name="q" id="query" style="width:85%">
+ <input type="submit" id="submit" value="Search">
+ </form>
+ </div> <!-- lucy-top_nav_box -->
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+ <div class="grid_4" id="lucy-left_nav_box">
+ <h6>About</h6>
+ <ul>
+ <li><a href="/">Welcome</a></li>
+ <li><a href="/clownfish.html">Clownfish</a></li>
+ <li><a href="/faq.html">FAQ</a></li>
+ <li><a href="/people.html">People</a></li>
+ </ul>
+ <h6>Resources</h6>
+ <ul>
+ <li><a href="/download.html">Download</a></li>
+ <li><a href="/mailing_lists.html">Mailing Lists</a></li>
+ <li><a href="/docs/">Documentation</a></li>
+ <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li>
+ <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li>
+ <li><a href="/version_control.html">Version Control</a></li>
+ </ul>
+ <h6>Related Projects</h6>
+ <ul>
+ <li><a href="http://lucene.apache.org/core/">Lucene</a></li>
+ <li><a href="http://dezi.org/">Dezi</a></li>
+ <li><a href="http://lucene.apache.org/solr/">Solr</a></li>
+ <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li>
+ <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li>
+ </ul>
+ </div> <!-- lucy-left_nav_box -->
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Near real-time index updates</h2>
+<p>While index updates are fast on average, worst-case update performance may be
+significantly slower. To make index updates consistently quick, we must
+manually intervene to control the process of index segment consolidation.</p>
+<h3>The problem</h3>
+<p>Ordinarily, modifying an index is cheap. New data is added to new segments,
+and the time to write a new segment scales more or less linearly with the
+number of documents added during the indexing session.</p>
+<p>Deletions are also cheap most of the time, because we don’t remove documents
+immediately but instead mark them as deleted, and adding the deletion mark is
+cheap.</p>
+<p>However, as new segments are added and the deletion rate for existing segments
+increases, search-time performance slowly begins to degrade. At some point,
+it becomes necessary to consolidate existing segments, rewriting their data
+into a new segment.</p>
+<p>If the recycled segments are small, the time it takes to rewrite them may not
+be significant. Every once in a while, though, a large amount of data must be
+rewritten.</p>
+<h3>Procrastinating and playing catch-up</h3>
+<p>The simplest way to force fast index updates is to avoid rewriting anything.</p>
+<p>Indexer relies upon <a href="../../../Lucy/Index/IndexManager.html">IndexManager</a>’s
+<a href="../../../Lucy/Index/IndexManager.html#func_Recycle">Recycle()</a> method to tell it which segments should
+be consolidated. If we subclass IndexManager and override the method so that
+it always returns an empty array, we get consistently quick performance:</p>
+<pre><code class="language-c">Vector*
+NoMergeManager_Recycle_IMP(IndexManager *self, PolyReader *reader,
+ DeletionsWriter *del_writer, int64_t cutoff,
+ bool optimize) {
+ return Vec_new(0);
+}
+
+void
+do_index(Obj *index) {
+ Class *klass = Class_singleton("NoMergeManager", INDEXMANAGER);
+ Class_Override(klass, (cfish_method_t)NoMergeManager_Recycle_IMP,
+ LUCY_IndexManager_Recycle_OFFSET);
+
+ IndexManager *manager = (IndexManager*)Class_Make_Obj(klass);
+ IxManager_init(manager, NULL, NULL);
+
+ Indexer *indexer = Indexer_new(NULL, index, manager, 0);
+ ...
+ Indexer_Commit(indexer);
+
+ DECREF(indexer);
+ DECREF(manager);
+}
+</code></pre>
+<p>However, we can’t procrastinate forever. Eventually, we’ll have to run an
+ordinary, uncontrolled indexing session, potentially triggering a large
+rewrite of lots of small and/or degraded segments:</p>
+<pre><code class="language-c">void
+do_index(Obj *index) {
+ Indexer *indexer = Indexer_new(NULL, index, NULL /* manager */, 0);
+ ...
+ Indexer_Commit(indexer);
+ DECREF(indexer);
+}
+</code></pre>
+<h3>Acceptable worst-case update time, slower degradation</h3>
+<p>Never merging anything at all in the main indexing process is probably
+overkill. Small segments are relatively cheap to merge; we just need to guard
+against the big rewrites.</p>
+<p>Setting a ceiling on the number of documents in the segments to be recycled
+allows us to avoid a mass proliferation of tiny, single-document segments,
+while still offering decent worst-case update speed:</p>
+<pre><code class="language-c">Vector*
+LightMergeManager_Recycle_IMP(IndexManager *self, PolyReader *reader,
+                              DeletionsWriter *del_writer, int64_t cutoff,
+                              bool optimize) {
+    /* Ask the parent class which segments it would recycle. */
+    IndexManager_Recycle_t super_recycle
+        = SUPER_METHOD_PTR(IndexManager, LUCY_IndexManager_Recycle);
+    Vector *seg_readers = super_recycle(self, reader, del_writer, cutoff,
+                                        optimize);
+    Vector *small_segments = Vec_new(0);
+
+    /* Keep only the segments that hold fewer than 10 documents. */
+    for (size_t i = 0, max = Vec_Get_Size(seg_readers); i < max; i++) {
+        SegReader *seg_reader = (SegReader*)Vec_Fetch(seg_readers, i);
+
+        if (SegReader_Doc_Max(seg_reader) < 10) {
+            Vec_Push(small_segments, INCREF(seg_reader));
+        }
+    }
+
+    DECREF(seg_readers);
+    return small_segments;
+}
+</code></pre>
+<p>However, we still have to consolidate every once in a while, and while that
+happens, content updates will be locked out.</p>
+<h3>Background merging</h3>
+<p>If it’s not acceptable to lock out updates while the index consolidation
+process runs, the alternative is to move the consolidation process out of
+band, using <a href="../../../Lucy/Index/BackgroundMerger.html">BackgroundMerger</a>.</p>
+<p>It’s never safe to have more than one Indexer attempting to modify the content
+of an index at the same time, but a BackgroundMerger and an Indexer can
+operate simultaneously:</p>
+<pre><code class="language-c">typedef struct {
+    Obj *index;
+    Doc *doc;
+} Context;
+
+static void
+S_index_doc(void *arg) {
+    Context *ctx = (Context*)arg;
+
+    Class *klass = Class_singleton("LightMergeManager", INDEXMANAGER);
+    Class_Override(klass, (cfish_method_t)LightMergeManager_Recycle_IMP,
+                   LUCY_IndexManager_Recycle_OFFSET);
+
+    IndexManager *manager = (IndexManager*)Class_Make_Obj(klass);
+    IxManager_init(manager, NULL, NULL);
+
+    Indexer *indexer = Indexer_new(NULL, ctx->index, manager, 0);
+    Indexer_Add_Doc(indexer, ctx->doc, 1.0);
+    Indexer_Commit(indexer);
+
+    DECREF(indexer);
+    DECREF(manager);
+}
+
+void
+indexing_process(Obj *index, Doc *doc) {
+    Context ctx;
+    ctx.index = index;
+    ctx.doc = doc;
+
+    /* max_retries is assumed to be defined elsewhere by the application. */
+    for (int i = 0; i < max_retries; i++) {
+        /* Trap exceptions so that a failed lock attempt can be retried. */
+        Err *err = Err_trap(S_index_doc, &ctx);
+        if (!err) { break; }
+        if (!Err_is_a(err, LOCKERR)) {
+            RETHROW(err);
+        }
+        WARN("Couldn't get lock (%d retries)", i);
+        DECREF(err);
+    }
+}
+
+void
+background_merge_process(Obj *index) {
+    IndexManager *manager = IxManager_new(NULL, NULL);
+    /* Wait up to 60000 ms for the write lock. */
+    IxManager_Set_Write_Lock_Timeout(manager, 60000);
+
+    BackgroundMerger *bg_merger = BGMerger_new(index, manager);
+    BGMerger_Commit(bg_merger);
+
+    DECREF(bg_merger);
+    DECREF(manager);
+}
+</code></pre>
+<p>The exception handling code becomes useful once you have more than one index
+modification process happening simultaneously. By default, Indexer tries
+several times to acquire a write lock over the span of one second, then holds
+it until <a href="../../../Lucy/Index/Indexer.html#func_Commit">Commit()</a> completes. BackgroundMerger handles
+most of its work
+without the write lock, but it does need it briefly once at the beginning and
+once again near the end. Under normal loads, the internal retry logic will
+resolve conflicts, but if it’s not acceptable to miss an insert, you probably
+want to catch <a href="../../../Lucy/Store/LockErr.html">LockErr</a> exceptions thrown by Indexer. In
+contrast, a LockErr from BackgroundMerger probably just needs to be logged.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+ <div id="lucy-copyright" class="container_16">
+ <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
+ <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+ <br/>
+ Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
+ Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
+ respective owners.
+ </p>
+ </div> <!-- lucy-copyright -->
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,124 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::DevGuide</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Quick-start guide to hacking on Apache Lucy.</h2>
+<p>The Apache Lucy code base is organized into roughly four layers:</p>
+<ul>
+<li>Charmonizer - compiler and OS configuration probing.</li>
+<li>Clownfish - header files.</li>
+<li>C - implementation files.</li>
+<li>Host - binding language.</li>
+</ul>
+<p>Charmonizer is a configuration prober which writes a single header file,
+“charmony.h”, describing the build environment and facilitating
+cross-platform development. It’s similar to Autoconf or Metaconfig, but
+written in pure C.</p>
+<p>The “.cfh” files within the Lucy core are Clownfish header files.
+Clownfish is a purpose-built, declaration-only language that superimposes
+a single-inheritance object model on top of C. The model is specifically
+designed to co-exist happily with a variety of “host” languages and to allow
+limited run-time dynamic subclassing. For more information see the
+Clownfish docs, but if there’s one thing you should know about Clownfish OO
+before you start hacking, it’s that method calls are differentiated from
+functions by capitalization:</p>
+<pre><code>Indexer_Add_Doc <-- Method, typically uses dynamic dispatch.
+Indexer_add_doc <-- Function, always a direct invocation.
+</code></pre>
+<p>The C files within the Lucy core are where most of Lucy’s low-level
+functionality lies. They implement the interface defined by the Clownfish
+header files.</p>
+<p>The C core is intentionally left incomplete, however; to be usable, it must
+be bound to a “host” language. (In this context, even C is considered a
+“host” which must implement the missing pieces and be “bound” to the core.)
+Some of the binding code is autogenerated by Clownfish based on a spec
+customized for each language. Other pieces are hand-coded in either C (using
+the host’s C API) or the host language itself.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html (added)
+++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html Wed Sep 28 12:07:48 2016
@@ -0,0 +1,108 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
+ <title>Lucy::Docs::DocIDs</title>
+ <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css">
+ </head>
+
+ <body>
+
+ <div id="lucy-rigid_wrapper">
+
+ <div id="lucy-top" class="container_16 lucy-white_box_3d">
+
+ <div id="lucy-logo_box" class="grid_8">
+ <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a>
+ </div> <!-- lucy-logo_box -->
+
+
+ <div class="clear"></div>
+
+ </div> <!-- lucy-top -->
+
+ <div id="lucy-main_content" class="container_16 lucy-white_box_3d">
+
+
+ <div id="lucy-main_content_box" class="grid_9">
+ <div class="c-api">
+<h2>Characteristics of Apache Lucy document ids.</h2>
+<h3>Document ids are signed 32-bit integers</h3>
+<p>Document ids in Apache Lucy start at 1. Because 0 is never a valid doc id, we
+can use it as a sentinel value:</p>
+<pre><code>Code example for C is missing</code></pre>
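+<p>As a rough illustration only, the following self-contained sketch shows the
+sentinel idiom. <code>DocIdIter</code>, <code>DocIdIter_next</code> and
+<code>highest_doc_id</code> are hypothetical stand-ins invented for this
+example and are not part of the Lucy API. The sketch assumes an iterator that
+returns 0 once it is exhausted, so a single loop condition both fetches the
+next doc id and detects the end of the sequence:</p>
+<pre><code class="language-c">#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical iterator over doc ids; NOT part of the Lucy API. */
+typedef struct {
+    const int32_t *doc_ids;  /* Valid doc ids, each of them >= 1. */
+    size_t         count;    /* Number of entries in doc_ids. */
+    size_t         tick;     /* Index of the next entry to hand out. */
+} DocIdIter;
+
+/* Return the next doc id, or 0 (never a valid doc id) when exhausted. */
+static int32_t
+DocIdIter_next(DocIdIter *self) {
+    if (self->tick >= self->count) { return 0; }
+    return self->doc_ids[self->tick++];
+}
+
+/* Return the highest doc id seen, or 0 if the iterator yields nothing. */
+static int32_t
+highest_doc_id(DocIdIter *iter) {
+    int32_t highest = 0;
+    int32_t doc_id;
+
+    /* The loop stops as soon as the sentinel value 0 comes back. */
+    while ((doc_id = DocIdIter_next(iter)) != 0) {
+        if (doc_id > highest) { highest = doc_id; }
+    }
+    return highest;
+}
+</code></pre>
+<p>Because 0 can never collide with a real document, the same value can serve
+as both the “no more documents” signal and the “nothing found” result
+without a separate status flag.</p>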
+<h3>Document ids are ephemeral</h3>
+<p>The document ids used by Lucy are associated with a single index
+snapshot. The moment an index is updated, the mapping of document ids to
+documents is subject to change.</p>
+<p>Since IndexReader objects represent a point-in-time view of an index, document
+ids are guaranteed to remain static for the life of the reader. However,
+because they are not permanent, Lucy document ids cannot be used as
+foreign keys to locate records in external data sources. If you truly need a
+primary key field, you must define it and populate it yourself.</p>
+<p>Furthermore, the order of document ids does not tell you anything about the
+sequence in which documents were added to the index.</p>
+</div>
+
+ </div> <!-- lucy-main_content_box -->
+ <div class="clear"></div>
+
+ </div> <!-- lucy-main_content -->
+
+
+ </div> <!-- lucy-rigid_wrapper -->
+
+ </body>
+</html>