Posted to commits@lucene.apache.org by "Hoss Man (Confluence)" <co...@apache.org> on 2013/07/20 02:52:00 UTC

[CONF] Apache Solr Reference Guide > Field Type Definitions and Properties

Space: Apache Solr Reference Guide (https://cwiki.apache.org/confluence/display/solr)
Page: Field Type Definitions and Properties (https://cwiki.apache.org/confluence/display/solr/Field+Type+Definitions+and+Properties)

Change Comment:
---------------------------------------------------------------------
add docValuesFormat and a note about upgrading if you don't use the default codec

Edited by Hoss Man:
---------------------------------------------------------------------
A field type includes four types of information:
* The name of the field type
* An implementation class name
* If the field type is {{TextField}}, a description of the field analysis for the field type
* Field attributes

h2. Field Type Definitions in {{schema.xml}}
Field types are defined in {{schema.xml}}, within the {{types}} element. Each field type is defined between opening and closing {{fieldType}} tags. Here is an example of a field type definition for a type called {{text_general}}:

{code:xml|borderStyle=solid|borderColor=#666666}
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
{code}

The first line in the example above contains the field type name, {{text_general}}, and the name of the implementing class, {{solr.TextField}}. The rest of the definition is about field analysis, described in [Understanding Analyzers, Tokenizers, and Filters].

The implementing class is responsible for making sure the field is handled correctly. In the class names in {{schema.xml}}, the string {{solr}} is shorthand for {{org.apache.solr.schema}} or {{org.apache.solr.analysis}}. Therefore, {{solr.TextField}} is really {{org.apache.solr.schema.TextField}}.
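
As a minimal sketch of the shorthand, the two definitions below should refer to the same implementing class. The type names {{text_short}} and {{text_long}} are hypothetical, and the analyzer definitions are omitted for brevity:

{code:xml|borderStyle=solid|borderColor=#666666}
<!-- Shorthand form, as used throughout schema.xml -->
<fieldType name="text_short" class="solr.TextField"/>

<!-- Equivalent fully qualified form (hypothetical type name) -->
<fieldType name="text_long" class="org.apache.solr.schema.TextField"/>
{code}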

h2. Field Type Properties

The field type {{class}} determines most of the behavior of a field type, but optional properties can also be defined. For example, the following definition of a date field type defines two properties, {{sortMissingLast}} and {{omitNorms}}.

{code:xml|borderStyle=solid|borderColor=#666666}
<fieldType name="date" class="solr.DateField"
        sortMissingLast="true" omitNorms="true"/>
{code}

Here are some commonly used properties. Most properties are either true or false. In addition, the filters or tokenizers defined for field analysis may have properties of their own.

{excerpt}
|| Field Property || Description || Values ||
| indexed | If true, the value of the field can be used in queries to retrieve matching documents | true or false |
| stored | If true, the actual value of the field can be retrieved by queries | true or false |
| sortMissingFirst \\
sortMissingLast | Control the placement of documents when a sort field is not present. As of Solr 3.5, these work for all numeric fields, including Trie and date fields. | true or false |
| multiValued | If true, indicates that a single document might contain multiple values for this field type | true or false |
| positionIncrementGap | For multivalued fields, specifies a distance between multiple values, which prevents spurious phrase matches | integer |
| omitNorms | If true, omits the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Defaults to true for all primitive (non-analyzed) field types, such as int, float, date, bool, and string. Only full-text fields or fields that need an index-time boost need norms. | true or false |
| omitTermFreqAndPositions | If true, omits term frequency, positions, and payloads from postings for this field. This can improve performance for fields that don't require that information, and it reduces the storage space required for the index. Queries that rely on position information (such as phrase queries) will silently fail to find documents when issued against a field with this option set. This property defaults to true for all fields that are not text fields. | true or false |
| autoGeneratePhraseQueries | For text fields. If true, Solr automatically generates phrase queries for adjacent terms. If false, terms must be enclosed in double-quotes to be treated as phrases. | true or false |
| docValuesFormat | Defines a custom {{DocValuesFormat}} to use for fields of this type. This requires that a schema-aware codec, such as the {{SchemaCodecFactory}}, has been configured in {{solrconfig.xml}}. | n/a |
| postingsFormat | Defines a custom {{PostingsFormat}} to use for fields of this type. This requires that a schema-aware codec, such as the {{SchemaCodecFactory}}, has been configured in {{solrconfig.xml}}. | n/a |
{excerpt}
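
To illustrate how several of these properties combine, here is a hedged sketch of two field type definitions. The type names {{text_multi}} and {{string_sorted}} are hypothetical, and the property values are illustrative choices rather than recommendations:

{code:xml|borderStyle=solid|borderColor=#666666}
<!-- Hypothetical multivalued text type: the gap of 100 positions between
     values keeps a phrase query from matching across two separate values -->
<fieldType name="text_multi" class="solr.TextField"
        multiValued="true" positionIncrementGap="100"
        autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical string type: documents with no value in the sort field
     are placed last, and norms are omitted -->
<fieldType name="string_sorted" class="solr.StrField"
        sortMissingLast="true" omitNorms="true"/>
{code}

With a gap of 100, a document whose values are "John Smith" and "Jane Doe" should not match the phrase query "Smith Jane", because the last token of the first value and the first token of the second are far apart in position.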

{info}Lucene index back-compatibility is only supported for the default codec. If you choose to customize the {{postingsFormat}} or {{docValuesFormat}} in your {{schema.xml}}, upgrading to a future version of Solr may require you to either switch back to the default codec and optimize your index (which rewrites it in the default codec) before upgrading, or rebuild your entire index from scratch after upgrading.
{info}
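
The following is a rough sketch of what a per-field-type format customization might look like. The type name {{string_custom}} is hypothetical, and the format names {{Memory}} and {{Disk}} are assumptions: the names actually available depend on the Lucene version bundled with your Solr release, so check them before use.

{code:xml|borderStyle=solid|borderColor=#666666}
<!-- In solrconfig.xml: enable a schema-aware codec -->
<codecFactory class="solr.SchemaCodecFactory"/>

<!-- In schema.xml: hypothetical type using non-default formats.
     "Memory" and "Disk" are assumed names; verify them for your release. -->
<fieldType name="string_custom" class="solr.StrField"
        postingsFormat="Memory" docValuesFormat="Disk"/>
{code}

Removing the {{postingsFormat}} and {{docValuesFormat}} attributes returns the type to the default codec's formats, which is the configuration covered by the back-compatibility note above.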

{scrollbar}

