Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/28 14:27:24 UTC

[Solr Wiki] Update of "TermsComponent" by GrantIngersoll

The following page has been changed by GrantIngersoll:
http://wiki.apache.org/solr/TermsComponent

New page:
= Introduction =

The !TermsComponent !SearchComponent is a simple plugin that provides access to Lucene's term dictionary (the !TermEnum).  This is useful for auto-suggest and other features that operate at the term level rather than the search or document level.  Currently, the !TermsComponent provides only !TermEnum access; it does not expose !TermDocs or position information.  This kind of lookup should be very fast.

See http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/TermEnum.html

See http://lucene.apache.org/java/2_4_0/fileformats.html for what Lucene's file formats look like.


= How it Works =

To use the !TermsComponent, pass in the request parameters below to control access to the !TermEnum.  The supported parameters are defined in the org.apache.solr.common.params.TermsParams class:

 * terms={true|false} - Turn on the !TermsComponent
 * terms.fl={FIELD NAME} - Required. The name of the field to get the terms from.
 * terms.lower={The lower bound term} - Optional.  The term to start at.  If not specified, the empty string is used, meaning start at the beginning of the field.
 * terms.upper={The upper bound term} - The term to stop at.  One of terms.upper, terms.rows, or rows must be set.
 * terms.upr.incl={true|false} - Optional.  Include the upper bound term in the result set.  Default is false.
 * terms.lwr.incl={true|false} - Optional.  Include the lower bound term in the result set.  Default is true.
 * terms.rows={integer} - The number of results to return.  One of terms.upper, terms.rows, or rows must be set.  If terms.rows is not specified, rows (CommonParams.ROWS) is used.  If neither is specified, the default is 10.
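
As a rough illustration of how the parameters above combine into a request, here is a minimal Python sketch.  The helper name build_terms_url is hypothetical, and the base URL is an assumption taken from the tutorial-style examples on this page; only the parameter names come from the list above.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, lower=None, upper=None,
                    lower_incl=None, upper_incl=None, rows=None):
    """Assemble a TermsComponent request URL (hypothetical helper)."""
    # terms=true turns the component on; terms.fl is the required field name.
    params = [("terms", "true"), ("terms.fl", field)]
    # Optional parameters are only sent when the caller supplies them,
    # so the server-side defaults described above still apply.
    if lower is not None:
        params.append(("terms.lower", lower))
    if upper is not None:
        params.append(("terms.upper", upper))
    if lower_incl is not None:
        params.append(("terms.lwr.incl", "true" if lower_incl else "false"))
    if upper_incl is not None:
        params.append(("terms.upr.incl", "true" if upper_incl else "false"))
    if rows is not None:
        params.append(("terms.rows", str(rows)))
    return base_url + "?" + urlencode(params)


# Assumed base URL, matching the handler path used in the examples on this page.
url = build_terms_url("http://localhost:8983/solr/autoSuggest", "name",
                      lower="a", upper="b", rows=2)
print(url)
```

Leaving optional parameters out of the URL, rather than sending explicit defaults, keeps the request short and lets the component apply its own defaults.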

The output is a list of the terms and their document frequency values.  Again, see http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/TermEnum.html
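
A client can turn that output into (term, document frequency) pairs with a few lines of code.  The following Python sketch assumes the XML layout used by the sample responses on this page (a lst element named "terms" containing one str element per term); the function name parse_terms and the embedded sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the same shape as the responses shown on this page.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
</lst>
</response>"""


def parse_terms(xml_text):
    """Return a list of (term, document_frequency) pairs, in response order."""
    if isinstance(xml_text, str):
        # Parse from bytes so the encoding declaration is honoured.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    terms = root.find("lst[@name='terms']")
    if terms is None:
        return []
    # Each <str name="TERM">DOCFREQ</str> child is one term.
    return [(node.get("name"), int(node.text)) for node in terms.findall("str")]


print(parse_terms(SAMPLE))
```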

= Examples =

The following examples use the Solr tutorial example located in the <Solr>/example directory.

== Simple ==

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name
}}}

This returns the first ten terms in the name field.

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="0">5</str>
 <str name="1">15</str>
 <str name="11">5</str>
 <str name="120">5</str>
 <str name="133">5</str>
 <str name="184">15</str>
 <str name="19">5</str>
 <str name="1900">5</str>
 <str name="2">15</str>
 <str name="20">5</str>
</lst>
</response>
}}}

== Lower ==

URL: 
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
 <str name="b">5</str>
</lst>
</response>
}}}

== Lower, Upper ==

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
</lst>
</response>
}}}

Notice that "b" is dropped: terms.upr.incl defaults to false, so the upper bound term is excluded from the result set.

== Exclusive of Lower Bound ==

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&terms.lwr.incl=false&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
</lst>
</response>
}}}


== Rows == 

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&indent=true&terms.rows=2
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
</lst>
</response>
}}}
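
The bound and row semantics shown in the examples above can be modelled with a small in-memory sketch.  This is not how the !TermsComponent is implemented; it is a Python approximation over a sorted list, assuming plain string ordering (which matches Lucene's term order for the ASCII terms used here) and the defaults listed in the parameter descriptions.  The function name select_terms is hypothetical.

```python
from bisect import bisect_left, bisect_right

# The terms from the "Lower" example above, in index order.
TERMS = ["a", "adata", "all", "allinon", "amber",
         "appl", "asus", "ata", "ati", "b"]


def select_terms(sorted_terms, lower="", upper=None,
                 lower_incl=True, upper_incl=False, rows=10):
    """Approximate terms.lower / terms.upper / *.incl / terms.rows on a sorted list."""
    # Lower bound: inclusive by default (terms.lwr.incl=true).
    if lower_incl:
        start = bisect_left(sorted_terms, lower)
    else:
        start = bisect_right(sorted_terms, lower)
    # Upper bound: exclusive by default (terms.upr.incl=false).
    if upper is None:
        end = len(sorted_terms)
    elif upper_incl:
        end = bisect_right(sorted_terms, upper)
    else:
        end = bisect_left(sorted_terms, upper)
    # Finally cap the result at the requested number of rows.
    return sorted_terms[start:end][:rows]


print(select_terms(TERMS, lower="a", upper="b"))                    # "b" excluded
print(select_terms(TERMS, lower="a", upper="b", lower_incl=False))  # "a" excluded too
print(select_terms(TERMS, lower="a", upper="b", rows=2))            # first two terms only
```

Running the three calls reproduces the term lists of the "Lower, Upper", "Exclusive of Lower Bound", and "Rows" examples respectively (without the document frequencies).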