Posted to commits@lucene.apache.org by ct...@apache.org on 2017/03/16 17:28:51 UTC

[04/26] lucene-solr:jira/solr-10290: SOLR-10290: Add .adoc files

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/transforming-result-documents.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/transforming-result-documents.adoc b/solr/solr-ref-guide/src/transforming-result-documents.adoc
new file mode 100644
index 0000000..8a305a9
--- /dev/null
+++ b/solr/solr-ref-guide/src/transforming-result-documents.adoc
@@ -0,0 +1,321 @@
+= Transforming Result Documents
+:page-shortname: transforming-result-documents
+:page-permalink: transforming-result-documents.html
+
+Document Transformers can be used to modify the information returned about each document in the results of a query.
+
+[[TransformingResultDocuments-UsingDocumentTransformers]]
+== Using Document Transformers
+
+When executing a request, a document transformer can be used by including it in the `fl` parameter using square brackets, for example:
+
+[source,java]
+----
+fl=id,name,score,[shard]
+----
+
+Some transformers allow, or require, local parameters which can be specified as key value pairs inside the brackets:
+
+[source,java]
+----
+fl=id,name,score,[explain style=nl]
+----
+
+As with regular fields, you can change the key used when a Transformer adds a field to a document via a prefix:
+
+[source,java]
+----
+fl=id,name,score,my_val_a:[value v=42 t=int],my_val_b:[value v=7 t=float]
+----
+
+The sections below discuss exactly what these various transformers do.
+
+[[TransformingResultDocuments-AvailableTransformers]]
+== Available Transformers
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[value]-ValueAugmenterFactory
+
+[[TransformingResultDocuments-_value_-ValueAugmenterFactory]]
+=== `[value]` - ValueAugmenterFactory
+
+Modifies every document to include the exact same value, as if it were a stored field in every document:
+
+[source,java]
+----
+q=*:*&fl=id,greeting:[value v='hello']
+----
+
+The above query would produce results like the following:
+
+[source,xml]
+----
+<result name="response" numFound="32" start="0">
+  <doc>
+    <str name="id">1</str>
+    <str name="greeting">hello</str></doc>
+  </doc>
+  ...
+----
+
+By default, values are returned as a String, but a "```t```" parameter can be specified using a value of int, float, double, or date to force a specific return type:
+
+[source,java]
+----
+q=*:*&fl=id,my_number:[value v=42 t=int],my_string:[value v=42]
+----
+
+In addition to using these request parameters, you can configure additional named instances of ValueAugmenterFactory, or override the default behavior of the existing `[value]` transformer, in your `solrconfig.xml` file:
+
+[source,xml]
+----
+<transformer name="mytrans2" class="org.apache.solr.response.transform.ValueAugmenterFactory" >
+  <int name="value">5</int>
+</transformer>
+<transformer name="value" class="org.apache.solr.response.transform.ValueAugmenterFactory" >
+  <double name="defaultValue">5</double>
+</transformer>
+----
+
+The "```value```" option forces an explicit value to always be used, while the "```defaultValue```" option provides a default that can still be overridden using the "```v```" and "```t```" local parameters.
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[explain]-ExplainAugmenterFactory
+
+[[TransformingResultDocuments-_explain_-ExplainAugmenterFactory]]
+=== `[explain]` - ExplainAugmenterFactory
+
+Augments each document with an inline explanation of its score exactly like the information available about each document in the debug section:
+
+[source,java]
+----
+q=features:cache&wt=json&fl=id,[explain style=nl]
+----
+
+Supported values for "```style```" are "```text```", "```html```", and "```nl```", which returns the information as structured data:
+
+[source,json]
+----
+  "response":{"numFound":2,"start":0,"docs":[
+      {
+        "id":"6H500F0",
+        "[explain]":{
+          "match":true,
+          "value":1.052226,
+          "description":"weight(features:cache in 2) [DefaultSimilarity], result of:",
+          "details":[{
+...
+----
+
+A default style can be configured by specifying an "args" parameter in your configuration:
+
+[source,xml]
+----
+<transformer name="explain" class="org.apache.solr.response.transform.ExplainAugmenterFactory" >
+  <str name="args">nl</str>
+</transformer>
+----
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[child]-ChildDocTransformerFactory
+
+[[TransformingResultDocuments-_child_-ChildDocTransformerFactory]]
+=== `[child]` - ChildDocTransformerFactory
+
+This transformer returns all <<uploading-data-with-index-handlers.adoc#UploadingDatawithIndexHandlers-NestedChildDocuments,descendant documents>> of each parent document matching your query in a flat list nested inside the matching parent document. This is useful when you have indexed nested child documents and want to retrieve the child documents for the relevant parent documents for any type of search query.
+
+[source,java]
+----
+fl=id,[child parentFilter=doc_type:book childFilter=doc_type:chapter limit=100]
+----
+
+Note that this transformer can be used even though the query itself is not a <<other-parsers.adoc#OtherParsers-BlockJoinQueryParsers,Block Join query>>.
+
+When using this transformer, the `parentFilter` parameter must be specified and works the same as in all Block Join Queries. Additional optional parameters are described below, followed by a complete example:
+
+* `childFilter` - a query to filter which child documents should be included; this can be particularly useful when you have multiple levels of hierarchical documents (default: all children)
+* `limit` - the maximum number of child documents to be returned per parent document (default: 10)
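+
+For example, a complete request combining these parameters might look like the following. A sketch, assuming a hypothetical collection named `gettingstarted` with `doc_type` values of `book` and `chapter`:
+
+[source,bash]
+----
+# --data-urlencode encodes the spaces and brackets inside the fl parameter
+curl 'http://localhost:8983/solr/gettingstarted/select' -G \
+  --data-urlencode 'q=doc_type:book' \
+  --data-urlencode 'fl=id,[child parentFilter=doc_type:book childFilter=doc_type:chapter limit=100]'
+----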
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[shard]-ShardAugmenterFactory
+
+[[TransformingResultDocuments-_shard_-ShardAugmenterFactory]]
+=== `[shard]` - ShardAugmenterFactory
+
+This transformer adds information about what shard each individual document came from in a distributed request.
+
+ShardAugmenterFactory does not support any request parameters, or configuration options.
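+
+A minimal request sketch, assuming a hypothetical SolrCloud collection named `gettingstarted` (the shard value is only meaningful for distributed requests):
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/gettingstarted/select' -G \
+  --data-urlencode 'q=*:*' \
+  --data-urlencode 'fl=id,[shard]'
+----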
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[docid]-DocIdAugmenterFactory
+
+[[TransformingResultDocuments-_docid_-DocIdAugmenterFactory]]
+=== `[docid]` - DocIdAugmenterFactory
+
+This transformer adds the internal Lucene document id to each document; this is primarily only useful for debugging purposes.
+
+DocIdAugmenterFactory does not support any request parameters, or configuration options.
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[elevated]and[excluded]
+
+[[TransformingResultDocuments-_elevated_and_excluded_]]
+=== `[elevated]` and `[excluded]`
+
+These transformers are available only when using the <<the-query-elevation-component.adoc#the-query-elevation-component,Query Elevation Component>>.
+
+* `[elevated]` annotates each document to indicate if it was elevated or not.
+* `[excluded]` annotates each document to indicate if it would have been excluded - this is only supported if you also use the `markExcludes` parameter.
+
+[source,java]
+----
+fl=id,[elevated],[excluded]&excludeIds=GB18030TEST&elevateIds=6H500F0&markExcludes=true
+----
+
+[source,json]
+----
+  "response":{"numFound":32,"start":0,"docs":[
+      {
+        "id":"6H500F0",
+        "[elevated]":true,
+        "[excluded]":false},
+      {
+        "id":"GB18030TEST",
+        "[elevated]":false,
+        "[excluded]":true},
+      {
+        "id":"SP2514N",
+        "[elevated]":false,
+        "[excluded]":false},
+...
+----
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[json]/[xml]
+
+[[TransformingResultDocuments-_json_xml_]]
+=== `[json]` / `[xml]`
+
+These transformers replace a field value containing a string representation of a valid XML or JSON structure with the actual raw XML or JSON structure rather than just the string value. Each applies only to the specific writer: `[json]` only applies to `wt=json` and `[xml]` only applies to `wt=xml`.
+
+[source,java]
+----
+fl=id,source_s:[json]&wt=json
+----
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[subquery]
+
+[[TransformingResultDocuments-_subquery_]]
+=== `[subquery]`
+
+This transformer executes a separate query for each document returned by the main query, passing fields from that document as input parameters to the subquery. It is usually used with the `{!join}` and `{!parent}` query parsers, and is intended to be an improvement over `[child]`.
+
+* It must be given a unique name: `fl=*,children:[subquery]`
+* There can be several of them, e.g., `fl=*,sons:[subquery],daughters:[subquery]`.
+* Every `[subquery]` occurrence adds a field to the result document with the given name; the value of this field is a document list produced by executing the subquery with the document's fields as input.
+
+Here is how the results look in various formats:
+
+[source,xml]
+----
+  <result name="response" numFound="2" start="0">
+      <doc>
+         <int name="id">1</int>
+         <arr name="title">
+            <str>vdczoypirs</str>
+         </arr>
+         <result name="children" numFound="1" start="0">
+            <doc>
+               <int name="id">2</int>
+               <arr name="title">
+                  <str>vdczoypirs</str>
+               </arr>
+            </doc>
+         </result>
+      </doc>
+  ...
+----
+
+[source,json]
+----
+"response":{
+  "numFound":2, "start":0,
+  "docs":[
+    {
+      "id":1,
+      "subject":["parentDocument"],
+      "title":["xrxvomgu"],
+      "children":{ 
+         "numFound":1, "start":0,
+         "docs":[
+            { "id":2,
+              "cat":["childDocument"]
+            }
+          ]
+    }},
+    {
+       "id":4,
+    ...
+----
+
+[source,java]
+----
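+// In SolrJ, each document's subquery results can be read back as a nested SolrDocumentList: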
+ SolrDocumentList subResults = (SolrDocumentList)doc.getFieldValue("children");
+----
+
+[[TransformingResultDocuments-SubqueryParametersShift]]
+==== Subquery Parameters Shift
+
+If a subquery is declared as `fl=*,foo:[subquery]`, the subquery parameters are prefixed with the given name and a period, e.g.:
+
+`q=*:*&fl=*,foo:[subquery]&foo.q=to be continued&foo.rows=10&foo.sort=id desc`
+
+[[TransformingResultDocuments-Documentfieldasaninputforsubqueryparams]]
+==== Document field as an input for subquery params
+
+It is often necessary to pass some document field values as parameters for the subquery. This is supported via the implicit *`row.__fieldname__`* parameter, which can be referenced (among other ways) via the Local Parameters syntax: `q=name:john&fl=name,id,depts:[subquery]&depts.q={!terms f=id v=$row.dept_id}&depts.rows=10`
+
+Here, departments are retrieved for every employee in the search results. This is analogous to the SQL clause `join ON emp.dept_id=dept.id`.
+
+Note that when a document field has multiple values, they are concatenated with a comma by default. This can be changed with the local parameter `foo:[subquery separator=' ']`; this mimics the *`{!terms}`* parser so the two work smoothly together.
+
+To log the substituted subquery request parameters, add the corresponding parameter names to `depts.logParamsList`, e.g., `depts.logParamsList=q,fl,rows,row.dept_id`
+
+[[TransformingResultDocuments-CoresandCollectionsinSolrCloud]]
+==== Cores and Collections in SolrCloud
+
+Use `foo:[subquery fromIndex=departments]` to invoke a subquery on another core on the same node; this is what *`{!join}`* does in non-SolrCloud mode. In SolrCloud mode, instead explicitly specify the subquery's native parameters, such as `collection` and `shards`, e.g.:
+
+`q=*:*&fl=*,foo:[subquery]&foo.q=cloud&foo.collection=departments`
+
+[IMPORTANT]
+====
+
+If the subquery collection has a different unique key field name (say `foo_id` in contrast to `id` in the primary collection), add the following parameters to accommodate the difference: `foo.fl=id:foo_id&foo.distrib.singlePass=true`. Otherwise you'll get a `NullPointerException` from `QueryComponent.mergeIds`.
+
+====
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[geo]-Geospatialformatter
+
+[[TransformingResultDocuments-_geo_-Geospatialformatter]]
+=== `[geo]` - Geospatial formatter
+
+Formats spatial data from a spatial field using a designated format type name. Two inner parameters are required: `f` for the field name, and `w` for the format name. Example: `geojson:[geo f=mySpatialField w=GeoJSON]`.
+
+Normally you'll simply be consistent in choosing the format type you want by setting the `format` attribute on the spatial field type to `WKT` or `GeoJSON` \u2013 see the section <<spatial-search.adoc#spatial-search,Spatial Search>> for more information. If you are consistent, it'll come out the way you stored it. This transformer offers a convenience to transform the spatial format to something different on retrieval.
+
+In addition, this feature is very useful with the `RptWithGeometrySpatialField` to avoid double-storage of the potentially large vector geometry. This transformer will detect that field type, fetch the geometry from an internal compact binary representation on disk (in docValues), and then format it as desired. As such, you needn't mark the field as stored, which would be redundant. This double-storage between docValues and stored values isn't unique to spatial fields, but with polygonal geometry it can be a lot of data, and you'd likely want to avoid storing it in a verbose format (like GeoJSON or WKT).
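+
+For example, a request returning the geometry as GeoJSON might look like this. A sketch, assuming a hypothetical collection with a spatial field named `mySpatialField`:
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/gettingstarted/select' -G \
+  --data-urlencode 'q=*:*' \
+  --data-urlencode 'fl=id,geojson:[geo f=mySpatialField w=GeoJSON]'
+----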
+
+// OLD_CONFLUENCE_ID: TransformingResultDocuments-[features]-LTRFeatureLoggerTransformerFactory
+
+[[TransformingResultDocuments-_features_-LTRFeatureLoggerTransformerFactory]]
+=== `[features]` - LTRFeatureLoggerTransformerFactory
+
+The "LTR" prefix stands for <<learning-to-rank.adoc#learning-to-rank,Learning To Rank>>. This transformer returns the values of features and it can be used for feature extraction and feature logging.
+
+[source,java]
+----
+fl=id,[features store=yourFeatureStore]
+----
+
+This will return the values of the features in the `yourFeatureStore` store.
+
+[source,java]
+----
+fl=id,[features]&rq={!ltr model=yourModel}
+----
+
+If you use `[features]` together with a Learning-To-Rank reranking query, then the values of the features in the reranking model (`yourModel`) will be returned.
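+
+As a complete request, the two usages above can be combined. A sketch, assuming a feature store named `yourFeatureStore` and a model named `yourModel` have already been uploaded to a hypothetical `gettingstarted` collection:
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/gettingstarted/select' -G \
+  --data-urlencode 'q=test' \
+  --data-urlencode 'fl=id,score,[features]' \
+  --data-urlencode 'rq={!ltr model=yourModel}'
+----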

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/uima-integration.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/uima-integration.adoc b/solr/solr-ref-guide/src/uima-integration.adoc
new file mode 100644
index 0000000..199257d
--- /dev/null
+++ b/solr/solr-ref-guide/src/uima-integration.adoc
@@ -0,0 +1,106 @@
+= UIMA Integration
+:page-shortname: uima-integration
+:page-permalink: uima-integration.html
+
+You can integrate the Apache Unstructured Information Management Architecture (https://uima.apache.org/[UIMA]) with Solr. UIMA lets you define custom pipelines of Analysis Engines that incrementally add metadata to your documents as annotations.
+
+For more information about Solr UIMA integration, see https://wiki.apache.org/solr/SolrUIMA.
+
+[[UIMAIntegration-ConfiguringUIMA]]
+== Configuring UIMA
+
+The SolrUIMA UpdateRequestProcessor is a custom update request processor that takes documents being indexed, sends them to a UIMA pipeline, and then returns the documents enriched with the specified metadata. To configure UIMA for Solr, follow these steps:
+
+1.  Copy `solr-uima-VERSION.jar` (under `/solr-VERSION/dist/`) and its libraries (under `contrib/uima/lib`) to a Solr libraries directory, or set `<lib/>` tags in `solrconfig.xml` appropriately to point to those jar files:
++
+[source,xml]
+----
+<lib dir="../../contrib/uima/lib" />
+<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />
+----
+2.  Modify `schema.xml`, adding your desired metadata fields and specifying proper values for the type, indexed, stored, and multiValued options. For example:
++
+[source,xml]
+----
+<field name="language" type="string" indexed="true" stored="true" required="false"/>
+<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
+<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />
+----
+3.  Add the following snippet to `solrconfig.xml`:
++
+[source,xml]
+----
+<updateRequestProcessorChain name="uima">
+  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
+    <lst name="uimaConfig">
+      <lst name="runtimeParameters">
+        <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
+        <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
+        <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
+        <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
+        <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
+        <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
+      </lst>
+      <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
+      <!-- Set to true if you want to continue indexing even if text processing fails.
+           Default is false, in which case Solr throws a RuntimeException and
+           does not index the documents in the request. -->
+      <bool name="ignoreErrors">true</bool>
+      <!-- This is optional. It is used for logging when text processing fails.
+           If logField is not specified, uniqueKey will be used as logField.
+      <str name="logField">id</str>
+      -->
+      <lst name="analyzeFields">
+        <bool name="merge">false</bool>
+        <arr name="fields">
+          <str>text</str>
+        </arr>
+      </lst>
+      <lst name="fieldMappings">
+        <lst name="type">
+          <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
+          <lst name="mapping">
+            <str name="feature">text</str>
+            <str name="field">concept</str>
+          </lst>
+        </lst>
+        <lst name="type">
+          <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
+          <lst name="mapping">
+            <str name="feature">language</str>
+            <str name="field">language</str>
+          </lst>
+        </lst>
+        <lst name="type">
+          <str name="name">org.apache.uima.SentenceAnnotation</str>
+          <lst name="mapping">
+            <str name="feature">coveredText</str>
+            <str name="field">sentence</str>
+          </lst>
+        </lst>
+      </lst>
+    </lst>
+  </processor>
+  <processor class="solr.LogUpdateProcessorFactory" />
+  <processor class="solr.RunUpdateProcessorFactory" />
+</updateRequestProcessorChain>
+----
++
+[IMPORTANT]
+====
+
+`VALID_ALCHEMYAPI_KEY` is your AlchemyAPI Access Key. You need to register an AlchemyAPI Access key to use AlchemyAPI services: http://www.alchemyapi.com/api/register.html. `VALID_OPENCALAIS_KEY` is your Calais Service Key. You need to register a Calais Service key to use the Calais services: http://www.opencalais.com/apikey. `analysisEngine` must contain an AE descriptor inside the specified path in the classpath. `analyzeFields` must contain the input fields that need to be analyzed by UIMA. If `merge=true`, then their content will be merged and analyzed only once. Field mapping describes which features of which types should go in a field.
+
+====
+4.  In your `solrconfig.xml` replace the existing default UpdateRequestHandler or create a new UpdateRequestHandler:
++
+[source,xml]
+----
+<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
+  <lst name="defaults">
+    <str name="update.chain">uima</str>
+  </lst>
+</requestHandler>
+----
+
+Once you are done with the configuration, your documents will be automatically enriched with the specified fields when you index them.
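+
+For example, with the "uima" chain configured as above, a standard XML add command indexes and enriches a document in one step. A sketch, assuming a hypothetical core named `gettingstarted` and a document with the `text` field listed in `analyzeFields`:
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/gettingstarted/update?commit=true' \
+  -H 'Content-type: text/xml' \
+  -d '<add><doc>
+        <field name="id">doc1</field>
+        <field name="text">Solr is an open source enterprise search platform.</field>
+      </doc></add>'
+----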

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/understanding-analyzers-tokenizers-and-filters.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/understanding-analyzers-tokenizers-and-filters.adoc b/solr/solr-ref-guide/src/understanding-analyzers-tokenizers-and-filters.adoc
new file mode 100644
index 0000000..0d6060f
--- /dev/null
+++ b/solr/solr-ref-guide/src/understanding-analyzers-tokenizers-and-filters.adoc
@@ -0,0 +1,48 @@
+= Understanding Analyzers, Tokenizers, and Filters
+:page-shortname: understanding-analyzers-tokenizers-and-filters
+:page-permalink: understanding-analyzers-tokenizers-and-filters.html
+:page-children: analyzers, about-tokenizers, about-filters, tokenizers, filter-descriptions, charfilterfactories, language-analysis, phonetic-matching, running-your-analyzer
+
+The following sections describe how Solr breaks down and works with textual data. There are three main concepts to understand: analyzers, tokenizers, and filters.
+
+<<analyzers.adoc#analyzers,Field analyzers>> are used both during ingestion, when a document is indexed, and at query time. An analyzer examines the text of fields and generates a token stream. Analyzers may be a single class or they may be composed of a series of tokenizer and filter classes.
+
+<<about-tokenizers.adoc#about-tokenizers,Tokenizers>> break field data into lexical units, or __tokens__.
+
+<<about-filters.adoc#about-filters,Filters>> examine a stream of tokens and keep them, transform or discard them, or create new ones. Tokenizers and filters may be combined to form pipelines, or __chains__, where the output of one is input to the next. Such a sequence of tokenizers and filters is called an _analyzer_ and the resulting output of an analyzer is used to match query results or build indices.
+
+// OLD_CONFLUENCE_ID: UnderstandingAnalyzers,Tokenizers,andFilters-UsingAnalyzers,Tokenizers,andFilters
+
+[[UnderstandingAnalyzers_Tokenizers_andFilters-UsingAnalyzers_Tokenizers_andFilters]]
+=== Using Analyzers, Tokenizers, and Filters
+
+Although the analysis process is used for both indexing and querying, the same analysis process need not be used for both operations. For indexing, you often want to simplify, or normalize, words. For example, setting all letters to lowercase, eliminating punctuation and accents, mapping words to their stems, and so on. Doing so can increase recall because, for example, "ram", "Ram" and "RAM" would all match a query for "ram". To increase query-time precision, a filter could be employed to narrow the matches by, for example, ignoring all-cap acronyms if you're interested in male sheep, but not Random Access Memory.
+
+The tokens output by the analysis process define the values, or __terms__, of that field and are used either to build an index of those terms when a new document is added, or to identify which documents contain the terms you are querying for.
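+
+You can observe the analysis process directly with Solr's field analysis request handler. A sketch, assuming a hypothetical collection named `gettingstarted` with a field type named `text_general`:
+
+[source,bash]
+----
+# Shows the token stream produced at each analysis stage for the given text
+curl 'http://localhost:8983/solr/gettingstarted/analysis/field' -G \
+  --data-urlencode 'analysis.fieldtype=text_general' \
+  --data-urlencode 'analysis.fieldvalue=ram Ram RAM'
+----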
+
+// OLD_CONFLUENCE_ID: UnderstandingAnalyzers,Tokenizers,andFilters-ForMoreInformation
+
+[[UnderstandingAnalyzers_Tokenizers_andFilters-ForMoreInformation]]
+=== For More Information
+
+These sections will show you how to configure field analyzers and also serve as a reference for the details of configuring each of the available tokenizer and filter classes. They also serve as a guide so that you can configure your own analysis classes if you have special needs that cannot be met with the included filters or tokenizers.
+
+*For Analyzers, see:*
+
+* <<analyzers.adoc#analyzers,Analyzers>>: Detailed conceptual information about Solr analyzers.
+* <<running-your-analyzer.adoc#running-your-analyzer,Running Your Analyzer>>: Detailed information about testing and running your Solr analyzer.
+
+*For Tokenizers, see:*
+
+* <<about-tokenizers.adoc#about-tokenizers,About Tokenizers>>: Detailed conceptual information about Solr tokenizers.
+* <<tokenizers.adoc#tokenizers,Tokenizers>>: Information about configuring tokenizers, and about the tokenizer factory classes included in this distribution of Solr.
+
+*For Filters, see:*
+
+* <<about-filters.adoc#about-filters,About Filters>>: Detailed conceptual information about Solr filters.
+* <<filter-descriptions.adoc#filter-descriptions,Filter Descriptions>>: Information about configuring filters, and about the filter factory classes included in this distribution of Solr.
+* <<charfilterfactories.adoc#charfilterfactories,CharFilterFactories>>: Information about filters for pre-processing input characters.
+
+*To find out how to use Tokenizers and Filters with various languages, see:*
+
+* <<language-analysis.adoc#language-analysis,Language Analysis>>: Information about tokenizers and filters for character set conversion or for use with specific languages.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/update-request-processors.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/update-request-processors.adoc b/solr/solr-ref-guide/src/update-request-processors.adoc
new file mode 100644
index 0000000..badee9e
--- /dev/null
+++ b/solr/solr-ref-guide/src/update-request-processors.adoc
@@ -0,0 +1,355 @@
+= Update Request Processors
+:page-shortname: update-request-processors
+:page-permalink: update-request-processors.html
+
+Every update request received by Solr is run through a chain of plugins known as Update Request Processors, or __URPs__. This can be useful, for example, to add a field to the document being indexed; to change the value of a particular field; or to drop an update if the incoming document doesn't fulfill certain criteria. In fact, a surprisingly large number of features in Solr are implemented as Update Processors, and therefore it is necessary to understand how such plugins work and where they are configured.
+
+[[UpdateRequestProcessors-AnatomyandLifecycle]]
+== Anatomy and Lifecycle
+
+An Update Request Processor is created as part of a {solr-javadocs}/solr-core/org/apache/solr/update/processor/UpdateRequestProcessorChain.html[chain] of one or more update processors. Solr creates a default update request processor chain comprising a few update request processors which enable essential Solr features. This default chain is used to process every update request unless a user chooses to configure and specify a different custom update request processor chain.
+
+The easiest way to describe an Update Request Processor is to look at the Javadocs of the abstract class {solr-javadocs}/solr-core/org/apache/solr/update/processor/UpdateRequestProcessor.html[UpdateRequestProcessor]. Every UpdateRequestProcessor must have a corresponding factory class which extends {solr-javadocs}/solr-core/org/apache/solr/update/processor/UpdateRequestProcessorFactory.html[UpdateRequestProcessorFactory]. This factory class is used by Solr to create a new instance of this plugin. Such a design provides two benefits:
+
+1.  An update request processor need not be thread safe because it is used by one and only one request thread and destroyed once the request is complete.
+2.  The factory class can accept configuration parameters and maintain any state that may be required between requests. The factory class must be thread-safe.
+
+Every update request processor chain is constructed during loading of a Solr core and cached until the core is unloaded. Each `UpdateRequestProcessorFactory` specified in the chain is also instantiated and initialized with configuration that may have been specified in `solrconfig.xml`.
+
+When an update request is received by Solr, it looks up the update chain to be used for this request. A new instance of each UpdateRequestProcessor specified in the chain is created using the corresponding factory. The update request is parsed into corresponding {solr-javadocs}/solr-core/org/apache/solr/update/UpdateCommand.html[UpdateCommand] objects which are run through the chain. Each UpdateRequestProcessor instance is responsible for invoking the next plugin in the chain. It can choose to short-circuit the chain by not invoking the next processor, and even abort further processing by throwing an exception.
+
+[NOTE]
+====
+
+A single update request may contain a batch of multiple new documents or deletes and therefore the corresponding processXXX methods of an UpdateRequestProcessor will be invoked multiple times for every individual update. However, it is guaranteed that a single thread will serially invoke these methods.
+
+====
+
+[[UpdateRequestProcessors-Configuration]]
+== Configuration
+
+Update request processor chains can be created either by creating the whole chain directly in `solrconfig.xml` or by creating individual update processors in `solrconfig.xml` and then dynamically creating the chain at run-time by specifying all processors via request parameters.
+
+However, before we understand how to configure update processor chains, we must learn about the default update processor chain because it provides essential features which are needed in most custom request processor chains as well.
+
+[[UpdateRequestProcessors-DefaultUpdateRequestProcessorChain]]
+=== Default Update Request Processor Chain
+
+If no update processor chains are configured in `solrconfig.xml`, Solr will automatically create a default update processor chain which will be used for all update requests. This default update processor chain consists of the following processors (in order):
+
+1.  `LogUpdateProcessorFactory` - Tracks the commands processed during this request and logs them
+2.  `DistributedUpdateProcessorFactory` - Responsible for distributing update requests to the right node, e.g., routing requests to the leader of the right shard and distributing updates from the leader to each replica. This processor is activated only in SolrCloud mode.
+3.  `RunUpdateProcessorFactory` - Executes the update using internal Solr APIs.
+
+Each of these performs an essential function, and as such any custom chain usually contains all of these processors. The `RunUpdateProcessorFactory` is usually the last update processor in any custom chain.
+
+[[UpdateRequestProcessors-CustomUpdateRequestProcessorChain]]
+=== Custom Update Request Processor Chain
+
+The following example demonstrates how a custom chain can be configured inside `solrconfig.xml`.
+
+*updateRequestProcessorChain*
+
+[source,xml]
+----
+<updateRequestProcessorChain name="dedupe">
+  <processor class="solr.processor.SignatureUpdateProcessorFactory">
+    <bool name="enabled">true</bool>
+    <str name="signatureField">id</str>
+    <bool name="overwriteDupes">false</bool>
+    <str name="fields">name,features,cat</str>
+    <str name="signatureClass">solr.processor.Lookup3Signature</str>
+  </processor>
+  <processor class="solr.LogUpdateProcessorFactory" />
+  <processor class="solr.RunUpdateProcessorFactory" />
+</updateRequestProcessorChain>
+----
+
+In the above example, a new update processor chain named "dedupe" is created with `SignatureUpdateProcessorFactory`, `LogUpdateProcessorFactory` and `RunUpdateProcessorFactory` in the chain. The `SignatureUpdateProcessorFactory` is further configured with different parameters such as "signatureField", "overwriteDupes", etc. This chain is an example of how Solr can be configured to perform de-duplication of documents by calculating a signature from the values of the name, features, and cat fields, which is then used as the "id" field. As you may have noticed, this chain does not specify the `DistributedUpdateProcessorFactory`. Because this processor is critical for Solr to operate properly, Solr will automatically insert `DistributedUpdateProcessorFactory` in any chain that does not include it, just prior to the `RunUpdateProcessorFactory`.
+
+.RunUpdateProcessorFactory
+[WARNING]
+====
+
+Do not forget to add `RunUpdateProcessorFactory` at the end of any chains you define in `solrconfig.xml`. Otherwise update requests processed by that chain will not actually affect the indexed data.
+
+====
+
+[[UpdateRequestProcessors-ConfiguringIndividualProcessorsasTop-LevelPlugins]]
+=== Configuring Individual Processors as Top-Level Plugins
+
+Update request processors can also be configured independent of a chain in `solrconfig.xml`.
+
+*updateProcessor*
+
+[source,xml]
+----
+<updateProcessor class="solr.processor.SignatureUpdateProcessorFactory" name="signature">
+  <bool name="enabled">true</bool>
+  <str name="signatureField">id</str>
+  <bool name="overwriteDupes">false</bool>
+  <str name="fields">name,features,cat</str>
+  <str name="signatureClass">solr.processor.Lookup3Signature</str>
+</updateProcessor>
+<updateProcessor class="solr.RemoveBlankFieldUpdateProcessorFactory" name="remove_blanks"/>
+----
+
+In this case, an instance of `SignatureUpdateProcessorFactory` is configured with the name "signature" and a `RemoveBlankFieldUpdateProcessorFactory` is defined with the name "remove_blanks". Once the above has been specified in `solrconfig.xml`, we can refer to them in update request processor chains in `solrconfig.xml` as follows:
+
+*updateRequestProcessorChains and updateProcessors*
+
+[source,xml]
+----
+<updateProcessorChain name="custom" processor="remove_blanks,signature">
+  <processor class="solr.RunUpdateProcessorFactory" />
+</updateProcessorChain>
+----
+
+[[UpdateRequestProcessors-UpdateProcessorsinSolrCloud]]
+== Update Processors in SolrCloud
+
+In a single-node, stand-alone Solr installation, each update is run through all the update processors in a chain exactly once. But the behavior of update request processors in SolrCloud deserves special consideration.
+
+A critical SolrCloud functionality is the routing and distributing of requests. For update requests this routing is implemented by the `DistributedUpdateRequestProcessor`, and this processor is given a special status by Solr due to its important function.
+
+In SolrCloud mode, all processors in the chain _before_ the `DistributedUpdateProcessor` are run on the first node that receives an update from the client, regardless of this node's status as a leader or replica. The `DistributedUpdateProcessor` then forwards the update to the appropriate shard leader for the update (or to multiple leaders in the event of an update that affects multiple documents, such as a delete by query or commit). The shard leader uses a transaction log to apply <<updating-parts-of-documents.adoc#updating-parts-of-documents,Atomic Updates & Optimistic Concurrency>> and then forwards the update to all of the shard replicas. The leader and each replica run all of the processors in the chain that are listed _after_ the `DistributedUpdateProcessor`.
+
+For example, consider the "dedupe" chain which we saw in a section above. Assume that a 3-node SolrCloud cluster exists where node A hosts the leader of shard1, node B hosts the leader of shard2 and node C hosts the replica of shard2. Assume that an update request is sent to node A which forwards the update to node B (because the update belongs to shard2) which then distributes the update to its replica node C. Let's see what happens at each node:
+
+* **Node A**: Runs the update through the `SignatureUpdateProcessor` (which computes the signature and puts it in the "id" field), then `LogUpdateProcessor`, and then `DistributedUpdateProcessor`. This processor determines that the update actually belongs to node B and forwards it there. The update is not processed further on node A. This is required because the next processor, `RunUpdateProcessor`, would execute the update against the local shard1 index, which would lead to duplicate data on shard1 and shard2.
+* **Node B**: Receives the update and sees that it was forwarded by another node. The update is directly sent to `DistributedUpdateProcessor` because it has already been through the `SignatureUpdateProcessor` on node A and doing the same signature computation again would be redundant. The `DistributedUpdateProcessor` determines that the update indeed belongs to this node, distributes it to its replica on Node C and then forwards the update further in the chain to `RunUpdateProcessor`.
+* **Node C**: Receives the update and sees that it was distributed by its leader. The update is directly sent to `DistributedUpdateProcessor` which performs some consistency checks and forwards the update further in the chain to `RunUpdateProcessor`.
+
+In summary:
+
+1.  All processors before `DistributedUpdateProcessor` are only run on the first node that receives an update request whether it be a forwarding node (e.g., node A in the above example) or a leader (e.g., node B). We call these "pre-processors" or just "processors".
+2.  All processors after `DistributedUpdateProcessor` run only on the leader and the replica nodes. They are not executed on forwarding nodes. Such processors are called "post-processors".
+
+In the previous section, we saw that the `updateRequestProcessorChain` was configured with `processor="remove_blanks, signature"`. This means that such processors are of the #1 kind and are run only on the forwarding nodes. Similarly, we can configure them as the #2 kind by specifying them with the "post-processor" attribute as follows:
+
+*post-processors*
+
+[source,xml]
+----
+<updateProcessorChain name="custom" processor="signature" post-processor="remove_blanks">
+  <processor class="solr.RunUpdateProcessorFactory" />
+</updateProcessorChain>
+----
+
+However, executing a processor only on the forwarding nodes is a great way of distributing an expensive computation such as de-duplication across a SolrCloud cluster by sending requests randomly via a load balancer. Otherwise, the expensive computation is repeated on both the leader and replica nodes.
+
+.Pre-processors and Atomic Updates
+[WARNING]
+====
+
+Because `DistributedUpdateProcessor` is responsible for processing <<updating-parts-of-documents.adoc#updating-parts-of-documents,Atomic Updates>> into full documents on the leader node, this means that pre-processors which are executed only on the forwarding nodes can only operate on the partial document. If you have a processor which must process a full document then the only choice is to specify it as a post-processor.
+
+====
+
+.Custom update chain post-processors may never be invoked on a recovering replica
+[WARNING]
+====
+
+While a replica is in <<read-and-write-side-fault-tolerance.adoc#ReadandWriteSideFaultTolerance-WriteSideFaultTolerance,recovery>>, inbound update requests are buffered to the transaction log. After recovery has completed successfully, those buffered update requests are replayed. As of this writing, however, custom update chain post-processors are never invoked for buffered update requests. See https://issues.apache.org/jira/browse/SOLR-8030[SOLR-8030]. To work around this problem until SOLR-8030 has been fixed, **avoid specifying post-processors in custom update chains**.
+
+====
+
+[[UpdateRequestProcessors-UsingCustomChains]]
+== Using Custom Chains
+
+[[UpdateRequestProcessors-update.chainRequestParameter]]
+=== update.chain Request Parameter
+
+The `update.chain` parameter can be used in any update request to choose a custom chain which has been configured in `solrconfig.xml`. For example, in order to choose the "dedupe" chain described in a previous section, one can issue the following request:
+
+*update.chain*
+
+[source,bash]
+----
+curl "http://localhost:8983/solr/gettingstarted/update/json?update.chain=dedupe&commit=true" -H 'Content-type: application/json' -d '
+[
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+  },
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+  }
+]'
+----
+
+The above should dedupe the two identical documents and index only one of them.
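+
+A simple follow-up query can confirm the result; only one document should be found. A sketch, against the same `gettingstarted` collection:
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/gettingstarted/select' -G \
+  --data-urlencode 'q=name:"The Lightning Thief"' \
+  --data-urlencode 'fl=id,name'
+----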
+
+// OLD_CONFLUENCE_ID: UpdateRequestProcessors-Processor&Post-ProcessorRequestParameters
+
+[[UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]]
+=== Processor & Post-Processor Request Parameters
+
+We can dynamically construct a custom update request processor chain using the "processor" and "post-processor" request parameters. Multiple processors can be specified as a comma-separated value for these two parameters. For example:
+
+*Constructing a chain at request time*
+
+[source,bash]
+----
+# Executing processors configured in solrconfig.xml as (pre)-processors
+curl "http://localhost:8983/solr/gettingstarted/update/json?processor=remove_blanks,signature&commit=true" -H 'Content-type: application/json' -d '
+[
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+  },
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+
+  }
+]'
+ 
+# Executing processors configured in solrconfig.xml as pre- and post-processors
+curl "http://localhost:8983/solr/gettingstarted/update/json?processor=remove_blanks&post-processor=signature&commit=true" -H 'Content-type: application/json' -d '
+[
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+  },
+  {
+    "name" : "The Lightning Thief",
+    "features" : "This is just a test",
+    "cat" : ["book","hardcover"]
+  }
+]'
+----
+
+In the first example, Solr will dynamically create a chain which has "remove_blanks" and "signature" as pre-processors to be executed only on the forwarding node, whereas in the second example, "remove_blanks" will be executed as a pre-processor and "signature" will be executed on the leader and replicas as a post-processor.
+
+[[UpdateRequestProcessors-ConfiguringaCustomChainasaDefault]]
+=== Configuring a Custom Chain as a Default
+
+We can also specify a custom chain to be used by default for all requests sent to specific update handlers instead of specifying the names in request parameters for each request.
+
+This can be done by adding either "update.chain" or "processor" and "post-processor" as default parameters for a given path, either via <<initparams-in-solrconfig.adoc#initparams-in-solrconfig,initParams>> or in a <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#requesthandlers-and-searchcomponents-in-solrconfig,"defaults" section>>, which is supported by all request handlers.
+
+The following is an `initParams` example from the <<schemaless-mode.adoc#schemaless-mode,schemaless configuration>> which applies a custom update chain to all request handlers starting with "/update/".
+
+*InitParams*
+
+[source,xml]
+----
+<initParams path="/update/**">
+  <lst name="defaults">
+    <str name="update.chain">add-unknown-fields-to-the-schema</str>
+  </lst>
+</initParams>
+----
+
+Alternately, one can achieve a similar effect using the "defaults" as shown in the example below:
+
+*defaults*
+
+[source,xml]
+----
+<requestHandler name="/update/extract"
+                startup="lazy"
+                class="solr.extraction.ExtractingRequestHandler" >
+  <lst name="defaults">
+    <str name="update.chain">add-unknown-fields-to-the-schema</str>
+  </lst>
+</requestHandler>
+----
+
+[[UpdateRequestProcessors-UpdateRequestProcessorFactories]]
+== Update Request Processor Factories
+
+What follows are brief descriptions of the currently available update request processors. An `UpdateRequestProcessorFactory` can be integrated into an update chain in `solrconfig.xml` as necessary. You are strongly urged to examine the Javadocs for these classes; these descriptions are abridged snippets taken for the most part from the Javadocs.
+
+[[UpdateRequestProcessors-GeneralUseUpdateProcessorFactories]]
+=== General Use UpdateProcessorFactories
+
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/AddSchemaFieldsUpdateProcessorFactory.html[AddSchemaFieldsUpdateProcessorFactory]: This processor will dynamically add fields to the schema if an input document contains one or more fields that don't match any field or dynamic field in the schema.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ClassificationUpdateProcessorFactory.html[ClassificationUpdateProcessorFactory]: This processor uses Lucene's classification module to provide simple document classification. See https://wiki.apache.org/solr/SolrClassification for more details on how to use this processor.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html[CloneFieldUpdateProcessorFactory]: Clones the values found in any matching _source_ field into the configured _dest_ field.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DefaultValueUpdateProcessorFactory.html[DefaultValueUpdateProcessorFactory]: A simple processor that adds a default value to any document which does not already have a value in fieldName.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.html[DocBasedVersionConstraintsProcessorFactory]: This Factory generates an UpdateProcessor that helps to enforce version constraints on documents based on per-document version numbers using a configured name of a versionField.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.html[DocExpirationUpdateProcessorFactory]: Update Processor Factory for managing automatic "expiration" of documents.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldNameMutatingUpdateProcessorFactory.html[FieldNameMutatingUpdateProcessorFactory]: Modifies field names by replacing all matches to the configured `pattern` with the configured `replacement`.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreCommitOptimizeUpdateProcessorFactory.html[IgnoreCommitOptimizeUpdateProcessorFactory]: Allows you to ignore commit and/or optimize requests from client applications when running in SolrCloud mode. For more information, see Shards and Indexing Data in SolrCloud.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexpBoostProcessorFactory.html[RegexpBoostProcessorFactory]: A processor which will match content of "inputField" against regular expressions found in "boostFilename", and if it matches will return the corresponding boost value from the file and output this to "boostField" as a double value.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/SignatureUpdateProcessorFactory.html[SignatureUpdateProcessorFactory]: Uses a defined set of fields to generate a hash "signature" for the document. Useful for only indexing one copy of "similar" documents.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html[StatelessScriptUpdateProcessorFactory]: An update request processor factory that enables the use of update processors implemented as scripts.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TimestampUpdateProcessorFactory.html[TimestampUpdateProcessorFactory]: An update processor that adds a newly generated date value of "NOW" to any document being added that does not already have a value in the specified field.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/URLClassifyProcessorFactory.html[URLClassifyProcessorFactory]: Update processor which examines a URL and outputs to various other fields with characteristics of that URL, including length, number of path levels, whether it is a top level URL (levels==0), whether it looks like a landing/index page, a canonical representation of the URL (e.g., stripping index.html), the domain and path parts of the URL, etc.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html[UUIDUpdateProcessorFactory]: An update processor that adds a newly generated UUID value to any document being added that does not already have a value in the specified field.
+
+[[UpdateRequestProcessors-FieldMutatingUpdateProcessorFactoryDerivedFactories]]
+=== FieldMutatingUpdateProcessorFactory Derived Factories
+
+These factories all provide functionality to _modify_ fields in a document as they're being indexed. When using any of these factories, please consult the {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html[FieldMutatingUpdateProcessorFactory javadocs] for details on the common options they all support for configuring which fields are modified.
+
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html[ConcatFieldUpdateProcessorFactory]: Concatenates multiple values for fields matching the specified conditions using a configurable delimiter.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/CountFieldValuesUpdateProcessorFactory.html[CountFieldValuesUpdateProcessorFactory]: Replaces any list of values for a field matching the specified conditions with the count of the number of values for that field.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html[FieldLengthUpdateProcessorFactory]: Replaces any CharSequence values found in fields matching the specified conditions with the lengths of those CharSequences (as an Integer).
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FirstFieldValueUpdateProcessorFactory.html[FirstFieldValueUpdateProcessorFactory]: Keeps only the first value of fields matching the specified conditions.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html[HTMLStripFieldUpdateProcessorFactory]: Strips all HTML Markup in any CharSequence values found in fields matching the specified conditions.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html[IgnoreFieldUpdateProcessorFactory]: Ignores and removes fields matching the specified conditions from any document being added to the index.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/LastFieldValueUpdateProcessorFactory.html[LastFieldValueUpdateProcessorFactory]: Keeps only the last value of fields matching the specified conditions.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/MaxFieldValueUpdateProcessorFactory.html[MaxFieldValueUpdateProcessorFactory]: An update processor that keeps only the maximum value from any selected fields where multiple values are found.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/MinFieldValueUpdateProcessorFactory.html[MinFieldValueUpdateProcessorFactory]: An update processor that keeps only the minimum value from any selected fields where multiple values are found.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseBooleanFieldUpdateProcessorFactory.html[ParseBooleanFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Boolean values.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html[ParseDateFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Solr date values.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseNumericFieldUpdateProcessorFactory.html[ParseNumericFieldUpdateProcessorFactory] derived classes:
+** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDoubleFieldUpdateProcessorFactory.html[ParseDoubleFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Double values.
+** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseFloatFieldUpdateProcessorFactory.html[ParseFloatFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Float values.
+** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseIntFieldUpdateProcessorFactory.html[ParseIntFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Integer values.
+** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseLongFieldUpdateProcessorFactory.html[ParseLongFieldUpdateProcessorFactory]: Attempts to mutate selected fields that have only CharSequence-typed values into Long values.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/PreAnalyzedUpdateProcessorFactory.html[PreAnalyzedUpdateProcessorFactory]: An update processor that parses configured fields of any document being added using _PreAnalyzedField_ with the configured format parser.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html[RegexReplaceProcessorFactory]: An update processor that applies a configured regex to any CharSequence values found in the selected fields, and replaces any matches with the configured replacement string.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html[RemoveBlankFieldUpdateProcessorFactory]: Removes any values found which are CharSequences with a length of 0 (i.e., empty strings).
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TrimFieldUpdateProcessorFactory.html[TrimFieldUpdateProcessorFactory]: Trims leading and trailing whitespace from any CharSequence values found in fields matching the specified conditions.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TruncateFieldUpdateProcessorFactory.html[TruncateFieldUpdateProcessorFactory]: Truncates any CharSequence values found in fields matching the specified conditions to a maximum character length.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/UniqFieldsUpdateProcessorFactory.html[UniqFieldsUpdateProcessorFactory]: Removes duplicate values found in fields matching the specified conditions.
+
+[[UpdateRequestProcessors-UpdateProcessorFactoriesThatCanBeLoadedasPlugins]]
+=== Update Processor Factories That Can Be Loaded as Plugins
+
+These processors are included in Solr releases as "contribs", and require additional jars loaded at runtime. See the README files associated with each contrib for details:
+
+* The {solr-javadocs}/solr-langid/index.html[`langid`] contrib provides:
+** {solr-javadocs}/solr-langid/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactory.html[LangDetectLanguageIdentifierUpdateProcessorFactory]: Identifies the language of a set of input fields using http://code.google.com/p/language-detection.
+** {solr-javadocs}/solr-langid/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactory.html[TikaLanguageIdentifierUpdateProcessorFactory]: Identifies the language of a set of input fields using Tika's LanguageIdentifier.
+* The {solr-javadocs}/solr-uima/index.html[`uima`] contrib provides:
+** {solr-javadocs}/solr-uima/org/apache/solr/uima/processor/UIMAUpdateRequestProcessorFactory.html[UIMAUpdateRequestProcessorFactory]: Update document(s) to be indexed with UIMA extracted information.
+
+[[UpdateRequestProcessors-UpdateProcessorFactoriesYouShouldNotModifyorRemove]]
+=== Update Processor Factories You Should _Not_ Modify or Remove
+
+These are listed for completeness, but are part of the Solr infrastructure, particularly SolrCloud. Other than ensuring you do _not_ remove them when modifying the update request handlers (or any copies you make), you will rarely, if ever, need to change these.
+
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DistributedUpdateProcessorFactory.html[DistributedUpdateProcessorFactory]: Used to distribute updates to all necessary nodes.
+** {solr-javadocs}/solr-core/org/apache/solr/update/processor/NoOpDistributingUpdateProcessorFactory.html[NoOpDistributingUpdateProcessorFactory]: An alternative No-Op implementation of `DistributingUpdateProcessorFactory` that always returns null. Designed for experts who want to bypass distributed updates and use their own custom update logic.
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/LogUpdateProcessorFactory.html[LogUpdateProcessorFactory]: A logging processor. This keeps track of all commands that have passed through the chain and prints them on finish().
+* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RunUpdateProcessorFactory.html[RunUpdateProcessorFactory]: Executes the update commands using the underlying UpdateHandler. Almost all processor chains should end with an instance of `RunUpdateProcessorFactory` unless the user is explicitly executing the update commands in an alternative custom `UpdateRequestProcessorFactory`.
+
+[[UpdateRequestProcessors-UpdateProcessorsThatCanBeUsedatRuntime]]
+=== Update Processors That Can Be Used at Runtime
+
+[[UpdateRequestProcessors-TemplateUpdateProcessorFactory]]
+==== TemplateUpdateProcessorFactory
+
+The `TemplateUpdateProcessorFactory` can be used to add new fields to documents based on a template pattern.
+
+This can be used directly in a request without any configuration. To enable this processor, use the parameter `processor=Template`. The template parameter `Template.field` (multivalued) defines the field to add and the pattern. Templates may contain placeholders which refer to other fields in the document. You can have multiple `Template.field` parameters in a single request.
+
+For example:
+
+[source,bash]
+----
+processor=Template&Template.field=fullName:Mr. ${firstName} ${lastName}
+----
+
+The above example would add a new field to the document called `fullName`. The fields `firstName` and `lastName` are supplied from the document fields. If either of them is missing, that part is replaced with an empty string. If those fields are multivalued, only the first value is used.
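+
+As a sketch of a complete request using this processor (collection and document values hypothetical; the spaces in the template are URL-encoded, and curl's `-g` flag keeps it from interpreting the `${...}` braces):
+
+[source,bash]
+----
+curl -g -X POST -H 'Content-Type: application/json' \
+  'http://localhost:8983/solr/mycollection/update?commit=true&processor=Template&Template.field=fullName:Mr.%20${firstName}%20${lastName}' \
+  --data-binary '[{"id":"101","firstName":"John","lastName":"Doe"}]'
+----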

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/updatehandlers-in-solrconfig.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/updatehandlers-in-solrconfig.adoc b/solr/solr-ref-guide/src/updatehandlers-in-solrconfig.adoc
new file mode 100644
index 0000000..3c568fd
--- /dev/null
+++ b/solr/solr-ref-guide/src/updatehandlers-in-solrconfig.adoc
@@ -0,0 +1,128 @@
+= UpdateHandlers in SolrConfig
+:page-shortname: updatehandlers-in-solrconfig
+:page-permalink: updatehandlers-in-solrconfig.html
+
+The settings in this section are configured in the `<updateHandler>` element in `solrconfig.xml` and may affect the performance of index updates. These settings affect how updates are done internally. `<updateHandler>` configurations do not affect the higher level configuration of <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#requesthandlers-and-searchcomponents-in-solrconfig,RequestHandlers>> that process client update requests.
+
+[source,xml]
+----
+<updateHandler class="solr.DirectUpdateHandler2">
+  ...
+</updateHandler>
+----
+
+[[UpdateHandlersinSolrConfig-Commits]]
+== Commits
+
+Data sent to Solr is not searchable until it has been _committed_ to the index. The reason for this is that in some cases commits can be slow and they should be done in isolation from other possible commit requests to avoid overwriting data. So, it's preferable to provide control over when data is committed. Several options are available to control the timing of commits.
+
+[[UpdateHandlersinSolrConfig-commitandsoftCommit]]
+=== `commit` and `softCommit`
+
+In Solr, a `commit` is an action which asks Solr to "commit" recent changes to the Lucene index files. By default, commit actions result in a "hard commit" of all the Lucene index files to stable storage (disk). When a client includes a `commit=true` parameter with an update request, all index segments affected by the adds & deletes of that update are written to disk as soon as index updates are completed.
+
+If an additional flag `softCommit=true` is specified, then Solr performs a 'soft commit': Solr commits your changes to the Lucene data structures quickly, but does not guarantee that the Lucene index files are written to stable storage. This is an implementation of Near Real Time storage, a feature that boosts document visibility, since you don't have to wait for background merges and storage (to ZooKeeper, if using <<solrcloud.adoc#solrcloud,SolrCloud>>) to finish before moving on to something else. A hard commit means that, if a server crashes, Solr will know exactly where your data was stored; a soft commit means that the data is stored, but the location information isn't yet stored. The tradeoff is durability for visibility: a soft commit makes changes searchable sooner, at the cost of the stable-storage guarantee a hard commit provides.
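+
+As a sketch (collection name hypothetical), the two commit styles are requested like this:
+
+[source,bash]
+----
+# hard commit: affected index segments are flushed to stable storage
+curl 'http://localhost:8983/solr/mycollection/update?commit=true'
+
+# soft commit: changes become visible to searchers quickly, without
+# waiting for the index files to reach stable storage
+curl 'http://localhost:8983/solr/mycollection/update?commit=true&softCommit=true'
+----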
+
+For more information about Near Real Time operations, see <<near-real-time-searching.adoc#near-real-time-searching,Near Real Time Searching>>.
+
+[[UpdateHandlersinSolrConfig-autoCommit]]
+=== `autoCommit`
+
+These settings control how often pending updates will be automatically pushed to the index. An alternative to `autoCommit` is to use `commitWithin`, which can be defined when making the update request to Solr (i.e., when pushing documents), or in an update RequestHandler.
+
+[width="100%",cols="50%,50%",options="header",]
+|===
+|Setting |Description
+|maxDocs |The maximum number of update documents to queue since the last commit before automatically triggering a new commit.
+|maxTime |The maximum time in milliseconds that may pass since the oldest uncommitted update before automatically triggering a new commit.
+|openSearcher |Whether to open a new searcher when performing a commit. If this is **false**, the commit flushes recent index changes to stable storage but does not open a new searcher to make those changes visible. The default is **true**.
+|===
+
+If either the `maxDocs` or the `maxTime` limit is reached, Solr automatically performs a commit operation. If the `autoCommit` tag is missing, then only explicit commits will update the index. The decision whether to use auto-commit or not depends on the needs of your application.
+
+Determining the best auto-commit settings is a tradeoff between performance and accuracy. Settings that cause frequent updates will improve the accuracy of searches because new content will be searchable more quickly, but performance may suffer because of the frequent updates. Less frequent updates may improve performance but it will take longer for updates to show up in queries.
+
+[source,xml]
+----
+<autoCommit>
+  <maxDocs>10000</maxDocs>
+  <maxTime>30000</maxTime>
+  <openSearcher>false</openSearcher>
+</autoCommit>
+----
+
+You can also specify 'soft' autoCommits in the same way that you can specify 'soft' commits, except that instead of using `autoCommit` you set the `autoSoftCommit` tag.
+
+[source,xml]
+----
+<autoSoftCommit> 
+  <maxTime>60000</maxTime> 
+</autoSoftCommit>
+----
+
+[[UpdateHandlersinSolrConfig-commitWithin]]
+=== `commitWithin`
+
+The `commitWithin` settings allow forcing document commits to happen within a defined time period. This is used most frequently with <<near-real-time-searching.adoc#near-real-time-searching,Near Real Time Searching>>, and for that reason the default is to perform a soft commit. This does not, however, replicate new documents to slave servers in a master/slave environment. If that's a requirement for your implementation, you can force a hard commit by adding a parameter, as in this example:
+
+[source,xml]
+----
+<commitWithin>
+  <softCommit>false</softCommit>
+</commitWithin>
+----
+
+With this configuration, when you call `commitWithin` as part of your update message, it will automatically perform a hard commit every time.
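+
+At request time, the commit window is expressed in milliseconds. As a sketch (collection and field names hypothetical), asking Solr to commit an update within ten seconds:
+
+[source,bash]
+----
+curl -X POST -H 'Content-Type: application/json' \
+  'http://localhost:8983/solr/mycollection/update?commitWithin=10000' \
+  --data-binary '[{"id":"42","name_s":"commit me within ten seconds"}]'
+----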
+
+[[UpdateHandlersinSolrConfig-EventListeners]]
+== Event Listeners
+
+The UpdateHandler section is also where update-related event listeners can be configured. These can be triggered to occur after any commit (`event="postCommit"`) or only after optimize commands (`event="postOptimize"`).
+
+Users can write custom update event listener classes, but a common use case is to run external executables via the `RunExecutableListener`:
+
+[width="100%",cols="50%,50%",options="header",]
+|===
+|Setting |Description
+|exe |The name of the executable to run. It should include the path to the file, relative to Solr home.
+|dir |The directory to use as the working directory. The default is ".".
+|wait |Forces the calling thread to wait until the executable returns a response. The default is **true**.
+|args |Any arguments to pass to the program. The default is none.
+|env |Any environment variables to set. The default is none.
+|===
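+
+As a sketch, a `postCommit` listener that runs an external snapshot script might be configured inside `<updateHandler>` like this (the executable path and arguments are illustrative):
+
+[source,xml]
+----
+<listener event="postCommit" class="solr.RunExecutableListener">
+  <str name="exe">solr/bin/snapshooter</str>
+  <str name="dir">.</str>
+  <bool name="wait">true</bool>
+  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
+  <arr name="env"> <str>MYVAR=val1</str> </arr>
+</listener>
+----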
+
+[[UpdateHandlersinSolrConfig-TransactionLog]]
+== Transaction Log
+
+As described in the section <<realtime-get.adoc#realtime-get,RealTime Get>>, a transaction log (update log) is required for that feature. The update log is enabled by default and is configured in the `updateHandler` section of `solrconfig.xml`, in a section like:
+
+[source,xml]
+----
+<updateLog>
+  <str name="dir">${solr.ulog.dir:}</str>
+</updateLog>
+----
+
+Three additional expert-level configuration settings affect indexing performance and how far a replica can fall behind on updates before it must enter into full recovery. See the section on <<read-and-write-side-fault-tolerance.adoc#ReadandWriteSideFaultTolerance-WriteSideFaultTolerance,write side fault tolerance>> for more information:
+
+[width="100%",cols="25%,25%,25%,25%",options="header",]
+|===
+|Setting Name |Type |Default |Description
+|numRecordsToKeep |int |100 |The number of update records to keep per log
+|maxNumLogsToKeep |int |10 |The maximum number of logs to keep
+|numVersionBuckets |int |65536 |The number of buckets used to keep track of max version values when checking for re-ordered updates. Increase this value to reduce the cost of synchronizing access to version buckets during high-volume indexing; note that this requires (8 bytes (long) * `numVersionBuckets`) of heap space per Solr core.
+|===
+
+An example, to be included under `<config><updateHandler>` in `solrconfig.xml`, employing the above advanced settings:
+
+[source,xml]
+----
+<updateLog>
+  <str name="dir">${solr.ulog.dir:}</str>
+  <int name="numRecordsToKeep">500</int>
+  <int name="maxNumLogsToKeep">20</int>
+  <int name="numVersionBuckets">65536</int>
+</updateLog>
+----

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/updating-parts-of-documents.adoc b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
new file mode 100644
index 0000000..249d1a4
--- /dev/null
+++ b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
@@ -0,0 +1,193 @@
+= Updating Parts of Documents
+:page-shortname: updating-parts-of-documents
+:page-permalink: updating-parts-of-documents.html
+
+Once you have indexed the content you need in your Solr index, you will want to start thinking about your strategy for dealing with changes to those documents. Solr supports two approaches to updating documents that have only partially changed.
+
+The first is __<<UpdatingPartsofDocuments-AtomicUpdates,atomic updates>>__. This approach allows changing only one or more fields of a document without having to re-index the entire document.
+
+The second approach is known as _<<UpdatingPartsofDocuments-OptimisticConcurrency,optimistic concurrency>>_ or __optimistic locking__. It is a feature of many NoSQL databases, and allows conditionally updating a document based on its version. This approach includes semantics and rules for how to deal with version matches or mismatches.
+
+Atomic Updates and Optimistic Concurrency may be used as independent strategies for managing changes to documents, or they may be combined: you can use optimistic concurrency to conditionally apply an atomic update.
+
+[[UpdatingPartsofDocuments-AtomicUpdates]]
+== Atomic Updates
+
+Solr supports several modifiers that atomically update values of a document. This allows updating only specific fields, which can help speed indexing processes in an environment where speed of index additions is critical to the application.
+
+To use atomic updates, add a modifier to the field that needs to be updated. The content can be updated, added to, or incrementally increased if a number.
+
+// TODO: This table has cells that won't work with PDF: https://github.com/ctargett/refguide-asciidoc-poc/issues/13
+
+[width="100%",cols="50%,50%",options="header",]
+|===
+|Modifier |Usage
+|set a|
+Set or replace the field value(s) with the specified value(s), or remove the values if 'null' or an empty list is specified as the new value.
+
+May be specified as a single value, or as a list for multivalued fields.
+
+|add a|
+Adds the specified values to a multivalued field.
+
+May be specified as a single value, or as a list.
+
+|remove a|
+Removes (all occurrences of) the specified values from a multivalued field.
+
+May be specified as a single value, or as a list.
+
+|removeregex a|
+Removes all occurrences of the specified regex from a multiValued field.
+
+May be specified as a single value, or as a list.
+
+|inc a|
+Increments a numeric value by a specific amount.
+
+Must be specified as a single numeric value.
+
+|===
+
+[[UpdatingPartsofDocuments-FieldStorage]]
+=== Field Storage
+
+The core functionality of atomically updating a document requires that all fields in your schema be configured as stored (`stored="true"`) or docValues (`docValues="true"`), except for fields which are `<copyField/>` destinations, which must be configured as `stored="false"`. Atomic updates are applied to the document represented by the existing stored field values. All data in these fields must originate from ONLY copyField sources.
+
+If `<copyField/>` destinations are configured as stored, then Solr will attempt to index both the current value of the field as well as an additional copy from any source fields. If such fields contain some information that comes from the indexing program and some information that comes from copyField, then the information which originally came from the indexing program will likely be lost when an atomic update is made.
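+
+As a sketch, a schema fragment consistent with these rules might pair a stored source field with a non-stored `<copyField/>` destination that is populated only by copyField (field names are illustrative):
+
+[source,xml]
+----
+<field name="title" type="text_general" indexed="true" stored="true"/>
+<field name="text_all" type="text_general" indexed="true" stored="false" multiValued="true"/>
+<copyField source="title" dest="text_all"/>
+----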
+
+[[UpdatingPartsofDocuments-Example]]
+=== Example
+
+If the following document exists in our collection:
+
+[source,json]
+----
+{"id":"mydoc", 
+ "price":10, 
+ "popularity":42,
+ "categories":["kids"],
+ "promo_ids":["a123x"],
+ "tags":["free_to_try","buy_now","clearance","on_sale"] 
+}
+----
+
+And we apply the following update command:
+
+[source,json]
+----
+{"id":"mydoc", 
+ "price":{"set":99}, 
+ "popularity":{"inc":20},
+ "categories":{"add":["toys","games"]},
+ "promo_ids":{"remove":"a123x"},
+ "tags":{"remove":["free_to_try","on_sale"]}
+}
+----
+
+The resulting document in our collection will be:
+
+[source,json]
+----
+{"id":"mydoc", 
+ "price":99, 
+ "popularity":62,
+ "categories":["kids","toys","games"],
+ "tags":["buy_now","clearance"] 
+}
+----
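+
+As a sketch of sending the update command above over HTTP (collection name hypothetical):
+
+[source,bash]
+----
+curl -X POST -H 'Content-Type: application/json' \
+  'http://localhost:8983/solr/mycollection/update?commit=true' --data-binary '
+[{"id":"mydoc",
+  "price":{"set":99},
+  "popularity":{"inc":20},
+  "categories":{"add":["toys","games"]},
+  "promo_ids":{"remove":"a123x"},
+  "tags":{"remove":["free_to_try","on_sale"]}}]'
+----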
+
+[[UpdatingPartsofDocuments-OptimisticConcurrency]]
+== Optimistic Concurrency
+
+Optimistic Concurrency is a feature of Solr that can be used by client applications which update/replace documents to ensure that the document they are replacing/updating has not been concurrently modified by another client application. This feature works by requiring a `_version_` field on all documents in the index, and comparing that to a `_version_` specified as part of the update command. By default, Solr's Schema includes a `_version_` field, and this field is automatically added to each new document.
+
+In general, using optimistic concurrency involves the following workflow:
+
+1.  A client reads a document. In Solr, one might retrieve the document with the `/get` handler to be sure to have the latest version.
+2.  A client changes the document locally.
+3.  The client resubmits the changed document to Solr, for example with the `/update` handler.
+4.  If there is a version conflict (HTTP error code 409), the client starts the process over.
+
+When the client resubmits a changed document to Solr, the `_version_` can be included with the update to invoke optimistic concurrency control. Specific semantics are used to define when the document should be updated or when to report a conflict.
+
+* If the content in the `_version_` field is greater than '1' (e.g., '12345'), then the `_version_` in the document must match the `_version_` in the index.
+* If the content in the `_version_` field is equal to '1', then the document must simply exist. In this case, no version matching occurs, but if the document does not exist, the updates will be rejected.
+* If the content in the `_version_` field is less than '0' (e.g., '-1'), then the document must *not* exist. In this case, no version matching occurs, but if the document exists, the updates will be rejected.
+* If the content in the `_version_` field is equal to '0', then it doesn't matter if the versions match or if the document exists or not. If it exists, it will be overwritten; if it does not exist, it will be added.
+
+If the document being updated does not include the `_version_` field, and atomic updates are not being used, the document will be treated by normal Solr rules, which is usually to discard the previous version.
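+
+The `_version_` may also be supplied as a field inside the document itself rather than as a request parameter. As a sketch (document values hypothetical), using '-1' to add a document only if it does not already exist:
+
+[source,bash]
+----
+curl -X POST -H 'Content-Type: application/json' \
+  'http://localhost:8983/solr/techproducts/update' --data-binary '
+[{ "id" : "ccc", "_version_" : -1, "foo_s" : "insert only if new" }]'
+----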
+
+When using Optimistic Concurrency, clients can include an optional `versions=true` request parameter to indicate that the _new_ versions of the documents being added should be included in the response. This allows clients to immediately know what the `_version_` is of every document added, without needing to make a redundant <<realtime-get.adoc#realtime-get,`/get` request>>.
+
+For example...
+
+[source,bash]
+----
+$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?versions=true' --data-binary '
+[ { "id" : "aaa" }, 
+  { "id" : "bbb" } ]'
+{"responseHeader":{"status":0,"QTime":6},
+ "adds":["aaa",1498562471222312960,
+         "bbb",1498562471225458688]}
+$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?_version_=999999&versions=true' --data-binary '
+[{ "id" : "aaa", 
+   "foo_s" : "update attempt with wrong existing version" }]'
+{"responseHeader":{"status":409,"QTime":3},
+ "error":{"msg":"version conflict for aaa expected=999999 actual=1498562471222312960",
+          "code":409}}
+$ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?_version_=1498562471222312960&versions=true&commit=true' --data-binary '
+[{ "id" : "aaa", 
+   "foo_s" : "update attempt with correct existing version" }]'
+{"responseHeader":{"status":0,"QTime":5},
+ "adds":["aaa",1498562624496861184]}
+$ curl 'http://localhost:8983/solr/techproducts/query?q=*:*&fl=id,_version_'
+{
+  "responseHeader":{
+    "status":0,
+    "QTime":5,
+    "params":{
+      "fl":"id,_version_",
+      "q":"*:*"}},
+  "response":{"numFound":2,"start":0,"docs":[
+      {
+        "id":"bbb",
+        "_version_":1498562471225458688},
+      {
+        "id":"aaa",
+        "_version_":1498562624496861184}]
+  }} 
+----
+
+For more information, please also see https://www.youtube.com/watch?v=WYVM6Wz-XTw[Yonik Seeley's presentation on NoSQL features in Solr 4] from Apache Lucene EuroCon 2012.
+
+[[UpdatingPartsofDocuments-DocumentCentricVersioningConstraints]]
+== Document Centric Versioning Constraints
+
+Optimistic Concurrency is extremely powerful, and works very efficiently because it uses internally assigned, globally unique values for the `_version_` field. However, in some situations users may want to configure their own document-specific version field, where the version values are assigned on a per-document basis by an external system, and have Solr reject updates that attempt to replace a document with an "older" version. In situations like this, the {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.html[`DocBasedVersionConstraintsProcessorFactory`] can be useful.
+
+The basic usage of `DocBasedVersionConstraintsProcessorFactory` is to configure it in `solrconfig.xml` as part of the http://wiki.apache.org/solr/UpdateRequestProcessor[UpdateRequestProcessorChain] and specify the name of your custom `versionField` in your schema that should be checked when validating updates:
+
+[source,xml]
+----
+<processor class="solr.DocBasedVersionConstraintsProcessorFactory">
+  <str name="versionField">my_version_l</str>
+</processor>
+----
+
+Once configured, this update processor will reject (HTTP error code 409) any attempt to update an existing document where the value of the `my_version_l` field in the "new" document is not greater than the value of that field in the existing document.
+
+.versionField vs _version_
+[IMPORTANT]
+====
+
+The `_version_` field used by Solr for its normal optimistic concurrency also has important semantics in how updates are distributed to replicas in SolrCloud, and *MUST* be assigned internally by Solr. Users cannot repurpose that field and specify it as the `versionField` for use in the `DocBasedVersionConstraintsProcessorFactory` configuration.
+
+====
+
+`DocBasedVersionConstraintsProcessorFactory` supports two additional optional configuration parameters:
+
+* `ignoreOldUpdates` - A boolean option which defaults to `false`. If set to `true` then instead of rejecting updates where the `versionField` is too low, the update will be silently ignored (and return a status 200 to the client).
+* `deleteVersionParam` - A String parameter that can be specified to indicate that this processor should also inspect Delete By Id commands. The value of this configuration option should be the name of a request parameter that the processor will now consider mandatory for all attempts to Delete By Id, and that must be used by clients to specify a value for the `versionField` which is greater than the existing value of the document to be deleted. When using this request parameter, any Delete By Id command with a high enough document version number to succeed will be internally converted into an Add Document command that replaces the existing document with a new one which is empty except for the Unique Key and `versionField`, keeping a record of the deleted version so that future Add Document commands will fail if their "new" version is not high enough. A configuration sketch is shown after this list.
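+
+For illustration only, a sketch of a configuration enabling `deleteVersionParam` (the parameter name `del_version` is hypothetical):
+
+[source,xml]
+----
+<processor class="solr.DocBasedVersionConstraintsProcessorFactory">
+  <str name="versionField">my_version_l</str>
+  <str name="deleteVersionParam">del_version</str>
+</processor>
+----
+
+With such a configuration, every Delete By Id request must include `del_version`, for example `/update?del_version=1234`, and the delete succeeds only if '1234' is greater than the existing document's `my_version_l` value.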
+
+Please consult the {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.html[DocBasedVersionConstraintsProcessorFactory javadocs] and https://git1-us-west.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/test-files/solr/collection1/conf/solrconfig-externalversionconstraint.xml;hb=HEAD[test solrconfig.xml file] for additional information and example usages.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/45a148a7/solr/solr-ref-guide/src/upgrading-a-solr-cluster.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/upgrading-a-solr-cluster.adoc b/solr/solr-ref-guide/src/upgrading-a-solr-cluster.adoc
new file mode 100644
index 0000000..3c75c32
--- /dev/null
+++ b/solr/solr-ref-guide/src/upgrading-a-solr-cluster.adoc
@@ -0,0 +1,96 @@
+= Upgrading a Solr Cluster
+:page-shortname: upgrading-a-solr-cluster
+:page-permalink: upgrading-a-solr-cluster.html
+:page-children: indexupgrader-tool
+
+This page covers how to upgrade an existing Solr cluster that was installed using the <<taking-solr-to-production.adoc#taking-solr-to-production,service installation scripts>>.
+
+[IMPORTANT]
+====
+
+The steps outlined on this page assume you use the default service name of "solr". If you use an alternate service name or Solr installation directory, some of the paths and commands mentioned below will have to be modified accordingly.
+
+====
+
+[[UpgradingaSolrCluster-PlanningYourUpgrade]]
+== Planning Your Upgrade
+
+Here is a checklist of things you need to prepare before starting the upgrade process:
+
+// TODO: This 'ol' has problematic nested lists inside of it, needs manual editing
+
+1.  Examine the <<upgrading-solr.adoc#upgrading-solr,Upgrading Solr>> page to determine if any behavior changes in the new version of Solr will affect your installation.
+2.  If not using replication (i.e., collections with replicationFactor = 1), then you should make a backup of each collection. If all of your collections use replication, then you don't technically need to make a backup since you will be upgrading and verifying each node individually.
+3.  Determine which Solr node is currently hosting the Overseer leader process in SolrCloud, as you should upgrade this node last. To determine the Overseer, use the Overseer Status API (see: <<collections-api.adoc#collections-api,Collections API>>); a sample request is shown after this checklist.
+4.  Plan to perform your upgrade during a system maintenance window if possible. You'll be doing a rolling restart of your cluster (each node, one-by-one), but we still recommend doing the upgrade when system usage is minimal.
+5.  Verify the cluster is currently healthy and all replicas are active, as you should not perform an upgrade on a degraded cluster.
+6.  Re-build and test all custom server-side components against the new Solr JAR files.
+7.  Determine the values of the following variables that are used by the Solr Control Scripts:
+* `ZK_HOST`: The ZooKeeper connection string your current SolrCloud nodes use to connect to ZooKeeper; this value will be the same for all nodes in the cluster.
+* `SOLR_HOST`: The hostname each Solr node used to register with ZooKeeper when joining the SolrCloud cluster; this value will be used to set the *host* Java system property when starting the new Solr process.
+* `SOLR_PORT`: The port each Solr node is listening on, such as 8983.
+* `SOLR_HOME`: The absolute path to the Solr home directory for each Solr node; this directory must contain a `solr.xml` file. This value will be passed to the new Solr process using the `solr.solr.home` system property, see: <<solr-cores-and-solr-xml.adoc#solr-cores-and-solr-xml,Solr Cores and solr.xml>>.
++
+If you are upgrading from an installation of Solr 5.x or later, these values can typically be found in either `/var/solr/solr.in.sh` or `/etc/default/solr.in.sh`.
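+
+For example, assuming a node is listening on port 8983, the Overseer leader mentioned in step 3 can be identified with a request like:
+
+[source,bash]
+----
+curl 'http://localhost:8983/solr/admin/collections?action=OVERSEERSTATUS&wt=json'
+----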
+
+You should now be ready to upgrade your cluster. Please verify this process in a test / staging cluster before doing it in production.
+
+[[UpgradingaSolrCluster-UpgradeProcess]]
+== Upgrade Process
+
+The approach we recommend is to perform the upgrade of each Solr node, one-by-one. In other words, you will need to stop a node, upgrade it to the new version of Solr, and restart it before moving on to the next node. This means that for a short period of time, there will be a mix of "Old Solr" and "New Solr" nodes running in your cluster. We also assume that you will point the new Solr node to your existing Solr home directory where the Lucene index files are managed for each collection on the node. This means that you won't need to move any index files around to perform the upgrade.
+
+// OLD_CONFLUENCE_ID: UpgradingaSolrCluster-Step1:StopSolr
+
+[[UpgradingaSolrCluster-Step1_StopSolr]]
+=== Step 1: Stop Solr
+
+Begin by stopping the Solr node you want to upgrade. After stopping the node, if using replication (i.e., collections with replicationFactor > 1), verify that all leaders hosted on the downed node have successfully migrated to other replicas; you can do this by visiting the <<cloud-screens.adoc#cloud-screens,Cloud panel in the Solr Admin UI>>. If not using replication, then any collections with shards hosted on the downed node will be temporarily off-line.
+
+// OLD_CONFLUENCE_ID: UpgradingaSolrCluster-Step2:InstallSolrasaService
+
+[[UpgradingaSolrCluster-Step2_InstallSolrasaService]]
+=== Step 2: Install Solr as a Service
+
+Please follow the instructions to install Solr as a Service on Linux documented at <<taking-solr-to-production.adoc#taking-solr-to-production,Taking Solr to Production>>. Use the `-n` parameter to avoid automatic start of Solr by the installer script. You need to update the `/etc/default/solr.in.sh` include file in the next step to complete the upgrade process.
+
+[NOTE]
+====
+
+If you have a `/var/solr/solr.in.sh` file for your existing Solr install, running the `install_solr_service.sh` script will move this file to its new location: `/etc/default/solr.in.sh` (see https://issues.apache.org/jira/browse/SOLR-8101[SOLR-8101] for more details)
+
+====
+
+// OLD_CONFLUENCE_ID: UpgradingaSolrCluster-Step3:SetEnvironmentVariableOverrides
+
+[[UpgradingaSolrCluster-Step3_SetEnvironmentVariableOverrides]]
+=== Step 3: Set Environment Variable Overrides
+
+Open `/etc/default/solr.in.sh` with a text editor and verify that the following variables are set correctly, or add them at the bottom of the include file as needed:
+
+[source,bash]
+----
+ZK_HOST=
+SOLR_HOST=
+SOLR_PORT=
+SOLR_HOME=
+----
+
+Make sure the user you plan to own the Solr process is the owner of the `SOLR_HOME` directory. For instance, if you plan to run Solr as the "solr" user and `SOLR_HOME` is `/var/solr/data`, then you would do: `sudo chown -R solr: /var/solr/data`
+
+// OLD_CONFLUENCE_ID: UpgradingaSolrCluster-Step4:StartSolr
+
+[[UpgradingaSolrCluster-Step4_StartSolr]]
+=== Step 4: Start Solr
+
+You are now ready to start the upgraded Solr node by doing: `sudo service solr start`. The upgraded instance will join the existing cluster because you're using the same `SOLR_HOME`, `SOLR_PORT`, and `SOLR_HOST` settings used by the old Solr node; thus, the new server will look like the old node to the running cluster. Be sure to look in `/var/solr/logs/solr.log` for errors during startup.
+
+// OLD_CONFLUENCE_ID: UpgradingaSolrCluster-Step5:RunHealthcheck
+
+[[UpgradingaSolrCluster-Step5_RunHealthcheck]]
+=== Step 5: Run Healthcheck
+
+You should run the Solr *healthcheck* command for all collections that are hosted on the upgraded node before proceeding to upgrade the next node in your cluster. For instance, if the newly upgraded node hosts a replica for the *MyDocuments* collection, then you can run the following command (replace ZK_HOST with the ZooKeeper connection string):
+
+[source,bash]
+----
+$ /opt/solr/bin/solr healthcheck -c MyDocuments -z ZK_HOST
+----
+
+Look for any problems reported about any of the replicas for the collection.
+
+Lastly, repeat Steps 1-5 for all nodes in your cluster.