Posted to commits@lucene.apache.org by "Hoss Man (Confluence)" <co...@apache.org> on 2013/07/20 02:52:00 UTC
[CONF] Apache Solr Reference Guide > Field Type Definitions and Properties
Space: Apache Solr Reference Guide (https://cwiki.apache.org/confluence/display/solr)
Page: Field Type Definitions and Properties (https://cwiki.apache.org/confluence/display/solr/Field+Type+Definitions+and+Properties)
Change Comment:
---------------------------------------------------------------------
add docValuesFormat and a note about upgrading if you don't use the default codec
Edited by Hoss Man:
---------------------------------------------------------------------
A field type includes four types of information:
* The name of the field type
* An implementation class name
* If the field type is {{TextField}}, a description of the field analysis for the field type
* Field attributes
h2. Field Type Definitions in {{schema.xml}}
Field types are defined in {{schema.xml}}, within the {{types}} element. Each field type is defined by a {{fieldType}} element. Here is an example of a field type definition for a type called {{text_general}}:
{code:xml|borderStyle=solid|borderColor=#666666}
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
{code}
The first line in the example above contains the field type name, {{text_general}}, and the name of the implementing class, {{solr.TextField}}. The rest of the definition is about field analysis, described in [Understanding Analyzers, Tokenizers, and Filters].
The implementing class is responsible for making sure the field is handled correctly. In the class names in {{schema.xml}}, the string {{solr}} is shorthand for {{org.apache.solr.schema}} or {{org.apache.solr.analysis}}. Therefore, {{solr.TextField}} is really {{org.apache.solr.schema.TextField}}.
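As a small illustration of the shorthand, the two declarations below should resolve to the same implementing class. This is only a sketch: the type names {{string_short}} and {{string_full}} are hypothetical and are not part of the example schema.
{code:xml|borderStyle=solid|borderColor=#666666}
<!-- shorthand form: "solr" stands in for the package name -->
<fieldType name="string_short" class="solr.StrField"/>
<!-- fully qualified form of the same class -->
<fieldType name="string_full" class="org.apache.solr.schema.StrField"/>
{code}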
h2. Field Type Properties
The field type {{class}} determines most of the behavior of a field type, but optional properties can also be defined. For example, the following definition of a date field type defines two properties, {{sortMissingLast}} and {{omitNorms}}.
{code:xml|borderStyle=solid|borderColor=#666666}
<fieldType name="date" class="solr.DateField"
sortMissingLast="true" omitNorms="true"/>
{code}
Here are some commonly used properties. Most properties are either true or false. In addition, the filters or tokenizers defined for field analysis may have properties that can be defined.
{excerpt}
|| Field Property || Description || Values ||
| indexed | If true, the value of the field can be used in queries to retrieve matching documents | true or false |
| stored | If true, the actual value of the field can be retrieved by queries | true or false |
| sortMissingFirst \\
sortMissingLast | Control the placement of documents when a sort field is not present. As of Solr 3.5, these work for all numeric fields, including Trie and date fields. | true or false |
| multiValued | If true, indicates that a single document might contain multiple values for this field type | true or false |
| positionIncrementGap | For multivalued fields, specifies a distance between multiple values, which prevents spurious phrase matches | integer |
| omitNorms | If true, omits the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Defaults to true for all primitive (non-analyzed) field types, such as int, float, date, bool, and string. Only full-text fields or fields that need an index-time boost need norms. | true or false |
| omitTermFreqAndPositions | If true, omits term frequency, positions, and payloads from postings for this field. This can be a performance boost for fields that don't require that information. It also reduces the storage space required for the index. Queries that rely on positions (such as phrase queries) and are issued against a field with this option set will silently fail to find documents. This property defaults to true for all fields that are not text fields. | true or false |
| autoGeneratePhraseQueries | For text fields. If true, Solr automatically generates phrase queries for adjacent terms. If false, terms must be enclosed in double-quotes to be treated as phrases. | true or false |
| docValuesFormat | Defines a custom {{DocValuesFormat}} to use for fields of this type. This requires that a schema-aware codec, such as the {{SchemaCodecFactory}}, has been configured in {{solrconfig.xml}}. | n/a |
| postingsFormat | Defines a custom {{PostingsFormat}} to use for fields of this type. This requires that a schema-aware codec, such as the {{SchemaCodecFactory}}, has been configured in {{solrconfig.xml}}. | n/a |
{excerpt}
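To show how several of these properties combine, here is a minimal sketch of a multivalued text type. The type name {{text_multi}} and the choice of tokenizer are assumptions made for illustration only.
{code:xml|borderStyle=solid|borderColor=#666666}
<!-- hypothetical type: a field of this type may hold several values per document -->
<fieldType name="text_multi" class="solr.TextField"
           indexed="true" stored="true"
           multiValued="true" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
Suppose a document has the two values "red apple" and "green pear" in a field of this type. With a gap of 100, the tokens of the second value are indexed roughly 100 positions after the tokens of the first, so a phrase query such as "apple green" should not match across the boundary between the two values.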
{info}Lucene index back-compatibility is only supported for the default codec. If you choose to customize the {{postingsFormat}} or {{docValuesFormat}} in your {{schema.xml}}, upgrading to a future version of Solr may require you to either switch back to the default codec and optimize your index to rewrite it into the default codec before upgrading, or re-build your entire index from scratch after upgrading.
{info}
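To make the codec requirement concrete, here is a minimal sketch. It assumes that {{solrconfig.xml}} enables the {{SchemaCodecFactory}} through a {{codecFactory}} element. The field type name {{string_memory}} is hypothetical, and the format name {{Memory}} is an assumption: the names that are valid depend on the postings formats available in your Lucene version.
{code:xml|borderStyle=solid|borderColor=#666666}
<!-- solrconfig.xml: enable a schema-aware codec -->
<codecFactory class="solr.SchemaCodecFactory"/>
{code}
{code:xml|borderStyle=solid|borderColor=#666666}
<!-- schema.xml: hypothetical type that asks for a non-default postings format -->
<fieldType name="string_memory" class="solr.StrField" postingsFormat="Memory"/>
{code}
Because this type does not use the default codec, the back-compatibility caveat above applies to any index that contains fields of this type.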
{scrollbar}