Posted to derby-commits@db.apache.org by ch...@apache.org on 2014/06/30 22:05:42 UTC
svn commit: r1606904 - in /db/derby/docs/trunk/src/tools:
rtoolsoptlucenecreate.dita rtoolsoptlucenelist.dita
rtoolsoptlucenequery.dita rtoolsoptluceneupdate.dita
Author: chaase3
Date: Mon Jun 30 20:05:42 2014
New Revision: 1606904
URL: http://svn.apache.org/r1606904
Log:
DERBY-6564 Document the experimental, optional LuceneSupport tool.
Modified 4 Tools Guide topics.
Patch: DERBY-6564-3.diff
Modified:
db/derby/docs/trunk/src/tools/rtoolsoptlucenecreate.dita
db/derby/docs/trunk/src/tools/rtoolsoptlucenelist.dita
db/derby/docs/trunk/src/tools/rtoolsoptlucenequery.dita
db/derby/docs/trunk/src/tools/rtoolsoptluceneupdate.dita
Modified: db/derby/docs/trunk/src/tools/rtoolsoptlucenecreate.dita
URL: http://svn.apache.org/viewvc/db/derby/docs/trunk/src/tools/rtoolsoptlucenecreate.dita?rev=1606904&r1=1606903&r2=1606904&view=diff
==============================================================================
--- db/derby/docs/trunk/src/tools/rtoolsoptlucenecreate.dita (original)
+++ db/derby/docs/trunk/src/tools/rtoolsoptlucenecreate.dita Mon Jun 30 20:05:42 2014
@@ -40,7 +40,7 @@ possible:</p>
SCHEMANAME VARCHAR( 128 ),
TABLENAME VARCHAR( 128 ),
TEXTCOLUMN VARCHAR( 128 ),
- ANALYZERMAKER VARCHAR( 32672 ),
+ INDEXDESCRIPTORMAKER VARCHAR( 32672 ),
KEYCOLUMNS VARCHAR( 32672 ) ...
)</codeblock>
<p>The procedure parameters are as follows:</p>
@@ -52,15 +52,31 @@ it.</li>
case-insensitive).</li>
<li><codeph>TEXTCOLUMN</codeph>: The SQL identifier of the text column being
indexed (also case-insensitive). The column must have a character datatype.</li>
-<li><codeph>ANALYZERMAKER</codeph>: If the argument is not null, this is the
-full name of a zero-argument static, public method which creates an analyzer. If
-the argument is null, the index is created via the default analyzer maker,
-<codeph>org.apache.derby.optional.api.LuceneUtils.defaultAnalyzer</codeph>. The
-default analyzer maker attempts to find a Lucene-supplied analyzer matching the
-default language of the database. Matches are found for the languages listed in
-the following table. Note that the Chinese analyzer was deprecated, so for
-Chinese, the plugin uses the <codeph>StandardAnalyzer</codeph> instead.
-<table frame="all">
+<li><codeph>INDEXDESCRIPTORMAKER</codeph>: If the argument is not null, this is
+the full name of a zero-argument static, public method which creates an
+<codeph>org.apache.derby.optional.api.IndexDescriptor</codeph>. If the argument
+is null, the index is created using the default maker method,
+<codeph>org.apache.derby.optional.api.LuceneUtils.defaultIndexDescriptor</codeph>.
+An <codeph>org.apache.derby.optional.api.IndexDescriptor</codeph> specifies the
+following:
+<ul>
+<li>The analyzer to use when parsing text into indexable terms</li>
+<li>The names of the indexed fields which can be queried later on</li>
+<li>The subclass of
+<codeph>org.apache.lucene.queryparser.classic.QueryParser</codeph> which should
+be used when querying the index later on</li>
+</ul>
+<p>The default <codeph>org.apache.derby.optional.api.IndexDescriptor</codeph>
+supplies one field name (<codeph>luceneTextField</codeph>) along with an
+instance of
+<codeph>org.apache.lucene.queryparser.classic.MultiFieldQueryParser</codeph> as
+its <codeph>QueryParser</codeph>. In addition, the default
+<codeph>org.apache.derby.optional.api.IndexDescriptor</codeph> attempts to find
+a Lucene-supplied analyzer matching the default language of the database.
+Matches are found for the languages listed in the following table. Note that the
+Chinese analyzer was deprecated, so for Chinese, the plugin uses the
+<codeph>StandardAnalyzer</codeph> instead.</p>
+<p><table frame="all">
<title>Language codes supported by the Lucene plugin</title>
<desc>This table lists the languages and corresponding language codes supported by the Lucene plugin.</desc>
<tgroup cols="2" colsep="1" rowsep="1">
@@ -196,6 +212,7 @@ Chinese, the plugin uses the <codeph>Sta
</tbody>
</tgroup>
</table>
+</p>
<p><ph conref="../conrefs.dita#prod/productshortname"></ph> supplies another
utility method which instantiates the default Lucene analyzer; this utility
method is called
@@ -221,15 +238,15 @@ greater detail.</p>
</section>
<section><title>Example</title>
<codeblock><b>-- index the POEMTEXT column of the POEMS table,
--- using its primary key and the default, locale-sensitive analyzer
+-- using its primary key and the default IndexDescriptor maker
CALL LUCENESUPPORT.CREATEINDEX( 'ruth', 'poems', 'poemText', null );
-- index the POEMVIEW view, using POEMID and VERSIONSTAMP as keys
--- and Lucene's StandardAnalyzer
+-- and a custom IndexDescriptor
CALL LUCENESUPPORT.CREATEINDEX
(
'ruth', 'poemView', 'poemText',
- 'org.apache.derby.optional.api.LuceneUtils.standardAnalyzer',
+ 'myapp.MyIndexDescriptor.makeMe',
'poemID', 'versionStamp'
);</b></codeblock>
</section>
Modified: db/derby/docs/trunk/src/tools/rtoolsoptlucenelist.dita
URL: http://svn.apache.org/viewvc/db/derby/docs/trunk/src/tools/rtoolsoptlucenelist.dita?rev=1606904&r1=1606903&r2=1606904&view=diff
==============================================================================
--- db/derby/docs/trunk/src/tools/rtoolsoptlucenelist.dita (original)
+++ db/derby/docs/trunk/src/tools/rtoolsoptlucenelist.dita Mon Jun 30 20:05:42 2014
@@ -38,7 +38,7 @@ RETURNS TABLE
LASTUPDATED TIMESTAMP,
LUCENEVERSION VARCHAR( 20 ),
ANALYZER VARCHAR( 32672 ),
- ANALYZYERMAKER VARCHAR( 32672 )
+ INDEXDESCRIPTORMAKER VARCHAR( 32672 )
)</codeblock>
</section>
<section><title>Example</title>
Modified: db/derby/docs/trunk/src/tools/rtoolsoptlucenequery.dita
URL: http://svn.apache.org/viewvc/db/derby/docs/trunk/src/tools/rtoolsoptlucenequery.dita?rev=1606904&r1=1606903&r2=1606904&view=diff
==============================================================================
--- db/derby/docs/trunk/src/tools/rtoolsoptlucenequery.dita (original)
+++ db/derby/docs/trunk/src/tools/rtoolsoptlucenequery.dita Mon Jun 30 20:05:42 2014
@@ -33,7 +33,6 @@ shape:</p>
<codeblock>$SCHEMANAME.$TABLENAME__TEXTCOL
(
QUERY VARCHAR( 32672 ),
- QUERYPARSERMAKER VARCHAR( 32672 ),
WINDOWSIZE INT,
SCORECEILING REAL
)
@@ -51,17 +50,6 @@ RETURNS TABLE
description of the <xref format="html"
href="https://builds.apache.org/job/Lucene-Artifacts-trunk/javadoc/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description"
scope="external">Lucene query language</xref>.</li>
-<li><codeph>QUERYPARSERMAKER</codeph>: This argument provides directions for query-parsing
-the <codeph>QUERY</codeph>. If the argument is not null, it is the name of a
-public, static method with the following signature:
-<codeblock>org.apache.lucene.queryparser.classic.QueryParser $methodName
-(
- org.apache.lucene.util.Version version,
- java.lang.String fieldName,
- org.apache.lucene.analysis.Analyzer analyzer
-)</codeblock>
-<p>If the argument is null, query-parsing is performed by the default Lucene
-<codeph>QueryParser</codeph>.</p></li>
<li><codeph>WINDOWSIZE</codeph>: This is the maximum number of rows (matches) to
return.</li>
<li><codeph>SCORECEILING</codeph>: This causes Lucene to return only rows whose
@@ -70,6 +58,8 @@ score is less than this number. <codeph>
result into windows. See the example below. A value of NULL means "return the
best WINDOWSIZE matches".</li>
</ul>
+<p>Remember that when the index was created, the application specified how the
+query should be parsed. </p>
<p>In the returned result set, the key columns join back to the original table
or view, and they identify which row of that table/view holds the scored text.
The other columns in the returned result set have the following meanings:</p>
@@ -94,7 +84,6 @@ from table
us.presidentsSpeeches__speechText
(
'When in the course of human events',
- null,
3,
null
)
@@ -109,24 +98,9 @@ from table
us.presidentsSpeeches__speechText
(
'When in the course of human events',
- null,
4,
1.0
)
-) t;
-
--- Selects the primary key and score for the best 100 matches, given a
--- custom query language.
-select presidentID, speechID, score
-from table
-(
- us.presidentsSpeeches__speechText, score
- (
- 'When AND course AND human AND events',
- 'com.mytools.MyQueryLanguage.MyQueryParser.create',
- 100,
- null
- )
) t;</b></codeblock>
</section>
</refbody>
Modified: db/derby/docs/trunk/src/tools/rtoolsoptluceneupdate.dita
URL: http://svn.apache.org/viewvc/db/derby/docs/trunk/src/tools/rtoolsoptluceneupdate.dita?rev=1606904&r1=1606903&r2=1606904&view=diff
==============================================================================
--- db/derby/docs/trunk/src/tools/rtoolsoptluceneupdate.dita (original)
+++ db/derby/docs/trunk/src/tools/rtoolsoptluceneupdate.dita Mon Jun 30 20:05:42 2014
@@ -36,10 +36,11 @@ reindexes the column across the whole ta
SCHEMANAME VARCHAR( 128 ),
TABLENAME VARCHAR( 128 ),
TEXTCOLUMN VARCHAR( 128 ),
- ANALYZERMAKER VARCHAR( 32672 )
+ INDEXDESCRIPTORMAKER VARCHAR( 32672 )
)</b></codeblock>
<p>The first three arguments identify the column to be reindexed. The last
-argument lets you override how the text is broken into indexable terms.</p>
+argument lets you override how the text is indexed and how queries are
+parsed.</p>
<p>This release of the <codeph>luceneSupport</codeph> tool does not support the
incremental reindexing of data. Updating the index is a bulk operation, which
reindexes an entire data set. For this reason, this release of the
@@ -59,7 +60,7 @@ of fuzziness in query results</li>
CALL LUCENESUPPORT.UPDATEINDEX
(
'ruth', 'poemView', 'poemText',
- 'com.mytools.MyAnalyzer.create',
+ 'myapp.MyIndexDescriptor.makeMe',
);</b></codeblock>
</section></refbody>
</reference>
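
Note on the INDEXDESCRIPTORMAKER argument: the patched text above describes it as "the full name of a zero-argument static, public method". The sketch below illustrates what such a name implies: the string splits at its last dot into a class name and a method name, and the method can then be looked up and called reflectively. This is only an illustration of that naming convention, not Derby's actual lookup code. The class MakerResolver and the method invokeMaker are hypothetical names, and JDK methods stand in for a real maker such as 'myapp.MyIndexDescriptor.makeMe'.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper, not part of Derby: resolves a string such as
// "some.package.SomeClass.someMethod" to a public, static, zero-argument
// method and invokes it.
final class MakerResolver {
    private MakerResolver() {}

    static Object invokeMaker(String fullMethodName) {
        // Everything before the last dot is the class name,
        // everything after it is the method name.
        int lastDot = fullMethodName.lastIndexOf('.');
        if (lastDot <= 0 || lastDot == fullMethodName.length() - 1) {
            throw new IllegalArgumentException(
                "Expected package.Class.method, got: " + fullMethodName);
        }
        String className = fullMethodName.substring(0, lastDot);
        String methodName = fullMethodName.substring(lastDot + 1);

        try {
            // getMethod with no parameter types finds only a public,
            // zero-argument method of that name.
            Method maker = Class.forName(className).getMethod(methodName);
            if (!Modifier.isStatic(maker.getModifiers())) {
                throw new IllegalArgumentException(
                    fullMethodName + " is not static");
            }
            // A static method is invoked with a null receiver.
            return maker.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot invoke maker method " + fullMethodName, e);
        }
    }

    public static void main(String[] args) {
        // A JDK method stands in for an application-supplied maker.
        Object made = invokeMaker("java.util.Collections.emptyList");
        System.out.println(made);
    }
}
```

A name that lacks a class part, or that points at a non-static method or one that takes arguments, is rejected with an IllegalArgumentException in this sketch. How the tool itself reports such errors is not stated in this commit.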