Posted to commits@lucene.apache.org by ct...@apache.org on 2018/09/06 17:20:56 UTC

[1/3] lucene-solr:branch_7x: SOLR-12684: put expression names and params in monospace

Repository: lucene-solr
Updated Branches:
  refs/heads/branch_7x a889dbd54 -> c85904288


SOLR-12684: put expression names and params in monospace


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/d6978717
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/d6978717
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/d6978717

Branch: refs/heads/branch_7x
Commit: d6978717c4ab161f1f6d597a4468302b2a38b24a
Parents: a889dbd
Author: Cassandra Targett <ct...@apache.org>
Authored: Wed Sep 5 20:35:37 2018 -0500
Committer: Cassandra Targett <ct...@apache.org>
Committed: Thu Sep 6 12:17:10 2018 -0500

----------------------------------------------------------------------
 .../src/stream-decorator-reference.adoc         | 23 ++++++++++----------
 1 file changed, 12 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/d6978717/solr/solr-ref-guide/src/stream-decorator-reference.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/stream-decorator-reference.adoc b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
index b397192..08f1e7a 100644
--- a/solr/solr-ref-guide/src/stream-decorator-reference.adoc
+++ b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
@@ -970,18 +970,20 @@ outerHashJoin(
 
 The `parallel` function wraps a streaming expression and sends it to N worker nodes to be processed in parallel.
 
-The parallel function requires that the `partitionKeys` parameter be provided to the underlying searches. The `partitionKeys` parameter will partition the search results (tuples) across the worker nodes. Tuples with the same values in the partitionKeys field will be shuffled to the same worker nodes.
+The `parallel` function requires that the `partitionKeys` parameter be provided to the underlying searches. The `partitionKeys` parameter will partition the search results (tuples) across the worker nodes. Tuples with the same values for `partitionKeys` will be shuffled to the same worker nodes.
 
-The parallel function maintains the sort order of the tuples returned by the worker nodes, so the sort criteria of the parallel function must incorporate the sort order of the tuples returned by the workers.
+The `parallel` function maintains the sort order of the tuples returned by the worker nodes, so the sort criteria must incorporate the sort order of the tuples returned by the workers.
 
-For example if you sort on year, month and day you could partition on year only as long as there was enough different years to spread the tuples around the worker nodes.
-Solr allows sorting on more than 4 fields, but you cannot specify more than 4 partitionKeys for speed tradeoffs. Also it's an overkill to specify many partitionKeys when we one or two keys could be enough to spread the tuples.
-Parallel Stream was designed when the underlying search stream will emit a lot of tuples from the collection. If the search stream only emits a small subset of the data from the collection using parallel could potentially be slower.
+For example, if you sort on year, month, and day, you could partition on year alone, as long as there are enough distinct years to spread the tuples across the worker nodes.
+
+Solr allows sorting on more than 4 fields, but for performance reasons you cannot specify more than 4 `partitionKeys`. It's also overkill to specify many `partitionKeys` when one or two keys would be enough to spread the tuples.
+
+The `parallel` function was designed for cases where the underlying search stream emits a large number of tuples from the collection. If the search stream emits only a small subset of the data in the collection, using `parallel` can potentially be slower.
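+
+For example, a `parallel` expression that partitions tuples across two workers on the `year_i` field might look like the following sketch (the collection names and `zkHost` value are illustrative):
+
+[source,text]
+----
+parallel(workerCollection,
+         search(collection1, q="*:*", fl="id,year_i", sort="year_i asc", qt="/export", partitionKeys="year_i"),
+         workers="2",
+         zkHost="localhost:9983",
+         sort="year_i asc")
+----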
 
 .Worker Collections
 [TIP]
 ====
-The worker nodes can be from the same collection as the data, or they can be a different collection entirely, even one that only exists for parallel streaming expressions. A worker collection can be any SolrCloud collection that has the `/stream` handler configured. Unlike normal SolrCloud collections, worker collections don't have to hold any data. Worker collections can be empty collections that exist only to execute streaming expressions.
+The worker nodes can be from the same collection as the data, or they can be a different collection entirely, even one that only exists for `parallel` streaming expressions. A worker collection can be any SolrCloud collection that has the `/stream` handler configured. Unlike normal SolrCloud collections, worker collections don't have to hold any data. Worker collections can be empty collections that exist only to execute streaming expressions.
 ====
 
 === parallel Parameters
@@ -1009,11 +1011,11 @@ The expression above shows a `parallel` function wrapping a `reduce` function. T
 .Warmup
 [TIP]
 ====
-The parallel stream uses the hash query parser to split the data amongst the workers. It executes on all the documents and the result bitset is cached in the filterCache.
-For a parallel stream with the same number of workers and partitonKeys the first query would be slower than subsequent queries.
+The `parallel` function uses the hash query parser to split the data amongst the workers. It executes on all the documents and the result bitset is cached in the filterCache.
+
+For a `parallel` stream with the same number of workers and `partitionKeys`, the first query will be slower than subsequent queries.
 To avoid paying the penalty for the first slow query, you can use a warmup query for every new searcher.
-The following is a solrconfig.xml snippet for 2 workers and "year_i" as the partionKeys.
-
+The following is a `solrconfig.xml` snippet for 2 workers and `year_i` as the `partitionKeys`.
 
 [source,text]
 ----
@@ -1024,7 +1026,6 @@ The following is a solrconfig.xml snippet for 2 workers and "year_i" as the part
 </arr>
 </listener>
 ----
-
 ====
 
 == priority


[3/3] lucene-solr:branch_7x: SOLR-12716: Move common params to top of page; insert links to common param section for each trigger; improve consistency

Posted by ct...@apache.org.
SOLR-12716: Move common params to top of page; insert links to common param section for each trigger; improve consistency


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/c8590428
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/c8590428
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/c8590428

Branch: refs/heads/branch_7x
Commit: c85904288dd370f13c0a1287b2fcc38ff8a73159
Parents: a84f84c
Author: Cassandra Targett <ct...@apache.org>
Authored: Thu Sep 6 11:58:51 2018 -0500
Committer: Cassandra Targett <ct...@apache.org>
Committed: Thu Sep 6 12:20:49 2018 -0500

----------------------------------------------------------------------
 .../src/solrcloud-autoscaling-triggers.adoc     | 320 +++++++++++--------
 1 file changed, 181 insertions(+), 139 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/c8590428/solr/solr-ref-guide/src/solrcloud-autoscaling-triggers.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/solrcloud-autoscaling-triggers.adoc b/solr/solr-ref-guide/src/solrcloud-autoscaling-triggers.adoc
index 9c8aac5..0b41e21 100644
--- a/solr/solr-ref-guide/src/solrcloud-autoscaling-triggers.adoc
+++ b/solr/solr-ref-guide/src/solrcloud-autoscaling-triggers.adoc
@@ -1,4 +1,5 @@
 = SolrCloud Autoscaling Triggers
+:page-tocclass: right
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
 // distributed with this work for additional information
@@ -16,31 +17,27 @@
 // specific language governing permissions and limitations
 // under the License.
 
-Triggers are used in autoscaling to watch for cluster events such as nodes joining, leaving, search rate or any other metric breaching a threshold.
-
-In the future other cluster, node, and replica events that are important from the
-point of view of cluster performance will also have available triggers.
+Triggers are used in autoscaling to watch for cluster events such as nodes joining or leaving, changes in search or index rate, scheduled times, or any other metric breaching a threshold.
 
 Trigger implementations verify the state of resources that they monitor. When they detect a
-change that merits attention they generate events, which are then queued and processed by configured
-`TriggerAction` implementations. This usually involves computing and executing a plan to manage the new cluster
-resources (e.g., move replicas). Solr provides predefined implementations of triggers for specific event types.
+change that merits attention, they generate _events_, which are then queued and processed by configured
+`TriggerAction` implementations. This usually involves computing and executing a plan to respond to the event (e.g., move replicas). Solr provides predefined implementations of triggers for <<Event Types,specific event types>>.
 
-Triggers execute on the node that runs `Overseer`. They are scheduled to run periodically, at a default interval of 1 second between each execution (not every execution produces events).
+Triggers execute on the node that runs `Overseer`. They are scheduled to run periodically, at a default interval of 1 second between each execution (although it's important to note that not every execution of a trigger produces events).
 
 == Event Types
 Currently the following event types (and corresponding trigger implementations) are defined:
 
-* `nodeAdded`: generated when a node joins the cluster
-* `nodeLost`: generated when a node leaves the cluster
-* `metric`: generated when the configured metric crosses a configured lower or upper threshold value
+* `nodeAdded`: generated when a node joins the cluster. See <<Node Added Trigger>>.
+* `nodeLost`: generated when a node leaves the cluster. See <<Node Lost Trigger>> and <<Auto Add Replicas Trigger>>.
+* `metric`: generated when the configured metric crosses a configured lower or upper threshold value. See <<Metric Trigger>>.
 * `indexSize`: generated when a shard size (defined as index size in bytes or number of documents)
-exceeds upper or lower threshold values
-* `searchRate`: generated when the search rate exceeds configured upper or lower thresholds
-* `scheduled`: generated according to a scheduled time period such as every 24 hours, etc
+exceeds upper or lower threshold values. See <<Index Size Trigger>>.
+* `searchRate`: generated when the search rate exceeds configured upper or lower thresholds. See <<Search Rate Trigger>>.
+* `scheduled`: generated according to a scheduled time period such as every 24 hours, etc. See <<Scheduled Trigger>>.
 
-Events are not necessarily generated immediately after the corresponding state change occurred - the
-maximum rate of events is controlled by the `waitFor` configuration parameter (see below).
+Events are not necessarily generated immediately after the corresponding state change occurred; the
+maximum rate of events is controlled by the `waitFor` configuration parameter (see <<Trigger Configuration>> below for more explanation).
 
 The following properties are common to all event types:
 
@@ -57,12 +54,82 @@ generated, which may significantly differ due to the rate limits set by `waitFor
 `properties`:: (map, optional) Any additional properties. Currently includes e.g., `nodeNames` property that
 indicates the nodes that were lost or added.
 
-== Node Added Trigger
+== Trigger Configuration
+Trigger configurations are managed using the <<solrcloud-autoscaling-api.adoc#write-api,Autoscaling Write API>> with the commands `<<solrcloud-autoscaling-api.adoc#create-update-trigger,set-trigger>>`, `<<solrcloud-autoscaling-api.adoc#remove-trigger,remove-trigger>>`,
+`suspend-trigger`, and `resume-trigger`.
+
+=== Trigger Properties
+
+Trigger configuration consists of the following properties:
+
+`name`:: (string, required) A unique trigger configuration name.
+
+`event`:: (string, required) One of the predefined event types described in <<Event Types>> (e.g., `nodeAdded` or `nodeLost`).
+
+`actions`:: (list of action configs, optional) An ordered list of actions to execute when the event is fired.
+
+`waitFor`:: (string, optional) The time to wait between generating new events, as an integer number immediately
+followed by a unit symbol, one of `s` (seconds), `m` (minutes), or `h` (hours). Default is `0s`. A condition must
+persist at least for the `waitFor` period to generate an event.
+
+`enabled`:: (boolean, optional) When `true` the trigger is enabled. Default is `true`.
+
+Additional implementation-specific properties may be provided, as described in the sections for individual triggers below.
+
+=== Action Properties
+
+Action configuration consists of the following properties:
+
+`name`:: (string, required) A unique name of the action configuration.
+
+`class`:: (string, required) The action implementation class.
+
+Additional implementation-specific properties may be provided, as described in the sections for individual triggers below.
+
+If the `actions` configuration is omitted, then by default, the `ComputePlanAction` and the `ExecutePlanAction` are automatically added to the trigger configuration.
+
+=== Example Trigger Configuration
+
+This simple example shows the configuration for adding (or updating) a trigger for `nodeAdded` events.
+
+[source,json]
+----
+{
+ "set-trigger": {
+  "name" : "node_added_trigger",
+  "event" : "nodeAdded",
+  "waitFor" : "1s",
+  "enabled" : true,
+  "actions" : [
+   {
+    "name" : "compute_plan",
+    "class": "solr.ComputePlanAction"
+   },
+   {
+    "name" : "custom_action",
+    "class": "com.example.CustomAction"
+   },
+   {
+    "name" : "execute_plan",
+    "class": "solr.ExecutePlanAction"
+   }
+  ]
+ }
+}
+----
+
+This trigger configuration will compute and execute a plan to allocate the resources available on the new node. A custom action could also be used to possibly modify the plan.
+
+== Available Triggers
+
+As described earlier, there are several triggers available to watch for events.
+
+=== Node Added Trigger
 
 The `NodeAddedTrigger` generates `nodeAdded` events when a node joins the cluster. It can be used to either move replicas
 from other nodes to the new node or to add new replicas.
 
-Apart from the parameters described at <<#trigger-configuration, Trigger Configuration>>, this trigger supports the following configuration:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports one more parameter:
 
 `preferredOperation`:: (string, optional, defaults to `movereplica`) The operation to be performed in response to an event generated by this trigger. By default, replicas will be moved from other nodes to the added node. The only other supported value is `addreplica` which adds more replicas of the existing collections on the new node.
 
@@ -91,12 +158,12 @@ Apart from the parameters described at <<#trigger-configuration, Trigger Configu
 }
 ----
 
-== Node Lost Trigger
+=== Node Lost Trigger
 
 The `NodeLostTrigger` generates `nodeLost` events when a node leaves the cluster. It can be used to either move replicas
 that were hosted by the lost node to other nodes or to delete them from the cluster.
 
-Apart from the parameters described at <<#trigger-configuration, Trigger Configuration>>, this trigger supports the following configuration:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports one more parameter:
 
 `preferredOperation`:: (string, optional, defaults to `MOVEREPLICA`) The operation to be performed in response to an event generated by this trigger. By default, replicas will be moved from the lost nodes to the other nodes in the cluster. The only other supported value is `DELETENODE` which deletes all information about replicas that were hosted by the lost node.
 
@@ -125,9 +192,9 @@ Apart from the parameters described at <<#trigger-configuration, Trigger Configu
 }
 ----
 
-TIP: It is recommended that the value of `waitFor` configuration for node lost trigger be larger than a minute so that large full garbage collection pauses do not cause this trigger to generate events and needlessly move or delete replicas in the cluster.
+TIP: It is recommended that the value of the `waitFor` configuration for the node lost trigger be larger than 1 minute so that large full garbage collection pauses do not cause this trigger to generate events and needlessly move or delete replicas in the cluster.
 
-== Auto Add Replicas Trigger
+=== Auto Add Replicas Trigger
 
 When a collection has the parameter `autoAddReplicas` set to true then a trigger configuration named `.auto_add_replicas` is automatically created to watch for nodes going away. This trigger produces `nodeLost` events,
 which are then processed by configured actions (usually resulting in computing and executing a plan
@@ -135,32 +202,39 @@ to add replicas on the live nodes to maintain the expected replication factor).
 
 Refer to the section <<solrcloud-autoscaling-auto-add-replicas.adoc#solrcloud-autoscaling-auto-add-replicas, Autoscaling Automatically Adding Replicas>> to learn more about how the `.autoAddReplicas` trigger works.
 
-This trigger supports one parameter, which is defined in the `<solrcloud>` section of `solr.xml`:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports one parameter, which is defined in the `<solrcloud>` section of `solr.xml`:
 
 `autoReplicaFailoverWaitAfterExpiration`::
 The minimum time in milliseconds to wait for initiating replacement of a replica after first noticing it not being live. This is important to prevent false positives while stopping or starting the cluster. The default is `120000` (2 minutes). The value provided for this parameter is used as the value for the `waitFor` parameter in the `.auto_add_replicas` trigger.
 
 TIP: See <<format-of-solr-xml.adoc#the-solrcloud-element,The <solrcloud> Element>> for more details about how to work with `solr.xml`.
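+
+For example, a minimal `<solrcloud>` element in `solr.xml` that sets this property to its default of 2 minutes might look like the following sketch (other `<solrcloud>` properties are omitted):
+
+[source,xml]
+----
+<solrcloud>
+  <int name="autoReplicaFailoverWaitAfterExpiration">120000</int>
+</solrcloud>
+----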
 
-== Metric Trigger
+=== Metric Trigger
 
 The metric trigger can be used to monitor any metric exposed by the <<metrics-reporting.adoc#metrics-reporting,Metrics API>>. It supports lower and upper threshold configurations as well as optional filters to limit operation to specific collection, shards, and nodes.
 
-This trigger supports the following configuration:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports the following parameters:
 
-`metric`:: (string, required) The metric property name to be watched in the format metric:__group__:__prefix__, e.g., `metric:solr.node:CONTAINER.fs.coreRoot.usableSpace`.
+`metric`::
+(string, required) The metric property name to be watched in the format metric:__group__:__prefix__, e.g., `metric:solr.node:CONTAINER.fs.coreRoot.usableSpace`.
 
-`below`:: (double, optional) The lower threshold for the metric value. The trigger produces a metric breached event if the metric's value falls below this value.
+`below`::
+(double, optional) The lower threshold for the metric value. The trigger produces a metric breached event if the metric's value falls below this value.
 
-`above`:: (double, optional) The upper threshold for the metric value. The trigger produces a metric breached event if the metric's value crosses above this value.
+`above`::
+(double, optional) The upper threshold for the metric value. The trigger produces a metric breached event if the metric's value crosses above this value.
 
-`collection`:: (string, optional) The collection used to limit the nodes on which the given metric is watched. When the metric is breached, trigger actions will limit operations to this collection only.
+`collection`::
+(string, optional) The collection used to limit the nodes on which the given metric is watched. When the metric is breached, trigger actions will limit operations to this collection only.
 
-`shard`:: (string, optional) The shard used to limit the nodes on which the given metric is watched. When the metric is breached, trigger actions will limit operations to this shard only.
+`shard`::
+(string, optional) The shard used to limit the nodes on which the given metric is watched. When the metric is breached, trigger actions will limit operations to this shard only.
 
-`node`:: (string, optional) The node on which the given metric is watched. Trigger actions will operate on this node only.
+`node`::
+(string, optional) The node on which the given metric is watched. Trigger actions will operate on this node only.
 
-`preferredOperation`:: (string, optional, defaults to `MOVEREPLICA`) The operation to be performed in response to an event generated by this trigger. By default, replicas will be moved from the hot node to others. The only other supported value is `ADDREPLICA` which adds more replicas if the metric is breached.
+`preferredOperation`::
+(string, optional, defaults to `MOVEREPLICA`) The operation to be performed in response to an event generated by this trigger. By default, replicas will be moved from the hot node to others. The only other supported value is `ADDREPLICA` which adds more replicas if the metric is breached.
 
 .Example: a metric trigger that fires when total usable space on a node having replicas of "mycollection" falls below 100GB
 [source,json]
@@ -177,7 +251,7 @@ This trigger supports the following configuration:
 }
 ----
 
-== Index Size Trigger
+=== Index Size Trigger
 This trigger can be used for monitoring the size of collection shards, measured either by the
 number of documents in a shard or the physical size of the shard's index in bytes.
 
@@ -193,34 +267,41 @@ that operation is not yet implemented (see https://issues.apache.org/jira/browse
 Additionally, monitoring can be restricted to a list of collections; by default
 all collections are monitored.
 
-This trigger supports the following configuration parameters (all thresholds are exclusive):
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports the following configuration parameters (all thresholds are exclusive):
 
-`aboveBytes`:: upper threshold in bytes. This value is compared to the `INDEX.sizeInBytes` metric.
+`aboveBytes`::
+An upper threshold in bytes. This value is compared to the `INDEX.sizeInBytes` metric.
 
-`belowBytes`:: lower threshold in bytes. Note that this value should be at least 2x smaller than
+`belowBytes`::
+A lower threshold in bytes. Note that this value should be at least 2x smaller than
 `aboveBytes`.
 
-`aboveDocs`:: upper threshold expressed as the number of documents. This value is compared with `SEARCHER.searcher.numDocs` metric.
+`aboveDocs`::
+An upper threshold expressed as the number of documents. This value is compared with the `SEARCHER.searcher.numDocs` metric.
 +
 NOTE: Due to the way Lucene indexes work, a shard may exceed the `aboveBytes` threshold
 even if the number of documents is relatively small, because replaced and deleted documents keep
 occupying disk space until they are actually removed during Lucene index merging.
 
-`belowDocs`:: lower threshold expressed as the number of documents.
+`belowDocs`::
+A lower threshold expressed as the number of documents.
 
-`aboveOp`:: operation to request when an upper threshold is exceeded. If not specified the
+`aboveOp`::
+The operation to request when an upper threshold is exceeded. If not specified, the
 default value is `SPLITSHARD`.
 
-`belowOp`:: operation to request when a lower threshold is exceeded. If not specified
+`belowOp`::
+The operation to request when a lower threshold is exceeded. If not specified,
 the default value is `MERGESHARDS` (but see the note above).
 
-`collections`:: comma-separated list of collection names that this trigger should monitor. If not
+`collections`::
+A comma-separated list of collection names that this trigger should monitor. If not
 specified or empty, all collections are monitored.
 
 Events generated by this trigger contain additional details about the shards
 that exceeded thresholds and the types of violations (upper / lower bounds, bytes / docs metrics).
 
-.Example:
+.Example: Index Size Trigger
 This configuration specifies an index size trigger that monitors collections "test1" and "test2",
 with both bytes (1GB) and number of docs (1 million) upper limits, and a custom `belowOp`
 operation `NONE` (which still can be monitored and acted upon by an appropriate trigger listener):
@@ -253,7 +334,7 @@ operation `NONE` (which still can be monitored and acted upon by an appropriate
 }
 ----
 
-== Search Rate Trigger
+=== Search Rate Trigger
 
 The search rate trigger can be used for monitoring search rates in a selected
 collection (1-min average rate by default), and request that either replicas be moved from
@@ -269,82 +350,101 @@ This method was chosen to avoid generating false events when a simple client kee
 to a single specific replica (because adding or removing other replicas can't solve this situation,
 only proper load balancing can - either by using `CloudSolrClient` or another load-balancing client).
 
-Note: this trigger calculates node-level cumulative rates using per-replica rates reported by
+This trigger calculates node-level cumulative rates using per-replica rates reported by
 replicas that are part of monitored collections / shards on each node. This means that it may report
 some nodes as "cold" (underutilized) because it ignores other, perhaps more active, replicas
 belonging to other collections. Also, nodes that don't host any of the monitored replicas or
 those that are explicitly excluded by `node` config property won't be reported at all.
 
-Note 2: special care should be taken when configuring `waitFor` property. By default the trigger
-monitors a 1-min average search rate of a replica. Changes to the number of replicas that should in turn
+.Calculating `waitFor`
+[CAUTION]
+====
+Special care should be taken when configuring the `waitFor` property. By default the trigger
+monitors a 1-minute average search rate of a replica. Changes to the number of replicas that should in turn
 change per-replica search rates may be requested and executed relatively quickly if the
-`waitFor` is set to comparable values of 1 min or shorter. However, the metric value, being a
-moving average, will always lag behind the new "momentary" rate after the changes. This in turn means that
-the monitored metric may not change sufficiently enough to prevent the
-trigger from firing again (because it will continue to measure the average rate as still violating
-the threshold for some time after the change was executed). As a result the trigger may keep
+`waitFor` is set to comparable values of 1 min or shorter.
+
+However, the metric value, being a moving average, will always lag behind the new "momentary" rate after the changes. This in turn means that the monitored metric may not change sufficiently enough to prevent the
+trigger from firing again, because it will continue to measure the average rate as still violating
+the threshold for some time after the change was executed. As a result the trigger may keep
 requesting that even more replicas be added (or removed) and thus it may "overshoot" the optimal number of replicas.
+
 For this reason it's recommended to always set `waitFor` to values several
-times longer than the time constant of the used metric. For example, with the default 1-min average the
-`waitFor` should be set to at least `2m` or more.
+times longer than the time constant of the used metric. For example, with the default 1-minute average the
+`waitFor` should be set to at least `2m` (2 minutes) or more.
+====
 
-This trigger supports the following configuration properties:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports the following configuration properties:
 
-`collections`:: (string, optional) comma-separated list of collection names to monitor, or any collection if empty / not set.
+`collections`::
+(string, optional) A comma-separated list of collection names to monitor, or any collection if empty or not set.
 
-`shard`:: (string, optional) shard name within the collection (requires `collections` to be set to exactly one name), or any shard if empty.
+`shard`::
+(string, optional) A shard name within the collection (requires `collections` to be set to exactly one name), or any shard if empty.
 
-`node`:: (string, optional) node name to monitor, or any if empty.
+`node`::
+(string, optional) A node name to monitor, or any if empty.
 
-`metric`:: (string, optional) metric name that represents the search rate
-(default is `QUERY./select.requestTimes:1minRate`). This name has to identify a single numeric
-metric value, and it may use the colon syntax for selecting one property of a complex metric. This value
-is collected from all replicas for a shard, and then an arithmetic average is calculated per shard
-to determine shard-level violations.
+`metric`::
+(string, optional) A metric name that represents the search rate. The default is `QUERY./select.requestTimes:1minRate`. This name has to identify a single numeric metric value, and it may use the colon syntax for selecting one property of
+a complex metric. This value is collected from all replicas for a shard, and then an arithmetic average is calculated
+per shard to determine shard-level violations.
 
-`maxOps`:: (integer, optional) maximum number of add replica / delete replica operations
-requested in a single autoscaling event. The default value is 3 and it helps to smooth out
+`maxOps`::
+(integer, optional) The maximum number of `ADDREPLICA` or `DELETEREPLICA` operations
+requested in a single autoscaling event. The default value is `3` and it helps to smooth out
 the changes to the number of replicas during periods of large search rate fluctuations.
 
-`minReplicas`:: (integer, optional) minimum acceptable number of searchable replicas (i.e., replicas other
-than `PULL` type). The trigger will not generate any DELETEREPLICA requests when the number of
-searchable replicas in a shard reaches this threshold. When this value is not set (the default)
+`minReplicas`::
+(integer, optional) The minimum acceptable number of searchable replicas (i.e., replicas other
+than `PULL` type). The trigger will not generate any `DELETEREPLICA` requests when the number of
+searchable replicas in a shard reaches this threshold.
++
+When this value is not set (the default)
 the `replicationFactor` property of the collection is used, and if that property is not set then
-the value is set to 1. Note also that shard leaders are never deleted.
+the value is set to `1`. Note also that shard leaders are never deleted.
 
-`aboveRate`:: (float) the upper bound for the request rate metric value. At least one of
+`aboveRate`::
+(float) The upper bound for the request rate metric value. At least one of
 `aboveRate` or `belowRate` must be set.
 
-`belowRate`:: (float) the lower bound for the request rate metric value. At least one of
+`belowRate`::
+(float) The lower bound for the request rate metric value. At least one of
 `aboveRate` or `belowRate` must be set.
 
-`aboveNodeRate`:: (float) the upper bound for the total request rate metric value per node. If not
+`aboveNodeRate`::
+(float) The upper bound for the total request rate metric value per node. If not
 set then cumulative per-node rates will be ignored.
 
-`belowNodeRate`:: (float) the lower bound for the total request rate metric value per node. If not
+`belowNodeRate`::
+(float) The lower bound for the total request rate metric value per node. If not
 set then cumulative per-node rates will be ignored.
 
-`aboveOp`:: (string, optional) collection action to request when the upper threshold for a shard is
+`aboveOp`::
+(string, optional) A collection action to request when the upper threshold for a shard is
 exceeded. Default action is `ADDREPLICA` and the trigger will request from 1 up to `maxOps` operations
 per shard per event, proportionally to how much the rate is exceeded. This property can be set to 'NONE'
 to effectively disable the action but still report it to the listeners.
 
-`aboveNodeOp`:: (string, optional) collection action to request when the upper threshold for a node (`aboveNodeRate`) is exceeded.
+`aboveNodeOp`::
+(string, optional) The collection action to request when the upper threshold for a node (`aboveNodeRate`) is exceeded.
 Default action is `MOVEREPLICA`, and the trigger will request 1 replica operation per hot node per event.
 If both `aboveOp` and `aboveNodeOp` operations are to be requested then `aboveNodeOp` operations are
 always requested first, and only if no `aboveOp` (shard level) operations are to be requested (because `aboveOp`
 operations will change node-level rates anyway). This property can be set to 'NONE' to effectively disable
 the action but still report it to the listeners.
 
-`belowOp`:: (string, optional) collection action to request when the lower threshold for a shard is
+`belowOp`::
+(string, optional) The collection action to request when the lower threshold for a shard is
 exceeded. Default action is `DELETEREPLICA`, and the trigger will request at most `maxOps` replicas
 to be deleted from eligible cold shards. This property can be set to 'NONE'
 to effectively disable the action but still report it to the listeners.
 
-`belowNodeOp`:: action to request when the lower threshold for a node (`belowNodeRate`) is exceeded.
+`belowNodeOp`::
+(string, optional) The action to request when the lower threshold for a node (`belowNodeRate`) is exceeded.
 Default action is null (not set) and the condition is ignored, because in many cases the
 trigger will monitor only some selected resources (replicas from selected
-collections / shards) so setting this by default to e.g., `DELETENODE` could interfere with
+collections or shards) so setting this by default to e.g., `DELETENODE` could interfere with
 these non-monitored resources. The trigger will request 1 operation per cold node per event.
 If both `belowOp` and `belowNodeOp` operations are requested then `belowOp` operations are
 always requested first.
@@ -355,6 +455,7 @@ average request rate of "/select" handler exceeds 100 requests/sec, and the cond
 for over 20 minutes. If the rate falls below 0.01 and persists for 20 min the trigger will
 request not only replica deletions (leaving at most 1 replica per shard) but also it may
 request node deletion.
+
 [source,json]
 ----
 {
@@ -384,11 +485,11 @@ request node deletion.
 }
 ----
 
-== Scheduled Trigger
+=== Scheduled Trigger
 
 The Scheduled trigger generates events according to a fixed rate schedule.
 
-The trigger supports the following configuration:
+In addition to the parameters described at <<Trigger Configuration>>, this trigger supports the following configuration:
 
 `startTime`::
 (string, required) The start date/time of the schedule. This should either be a DateMath string e.g., 'NOW', or be an ISO-8601 date time string (the same standard used during search and indexing in Solr, which defaults to UTC), or be specified without the trailing 'Z' accompanied with the `timeZone` parameter. For example, each of the following values are acceptable:
@@ -410,65 +511,6 @@ The trigger supports the following configuration:
 
 This trigger applies the `every` date math expression to the `startTime` (or to the last event time) to derive the next scheduled time; if the current time is past the next scheduled time but still within `graceTime`, an event is generated.
 
-Apart from the common event properties described in the Event Types section, the trigger adds an additional `actualEventTime` event property which has the actual event time as opposed to the scheduled time.
+Apart from the common event properties described in the <<Event Types>> section, the trigger adds an `actualEventTime` event property, which contains the actual event time as opposed to the scheduled time.
 
 For example, if the scheduled time was `2018-01-31T15:30:00Z` and grace time was `+15MINUTES` then an event may be fired at `2018-01-31T15:45:00Z`. Such an event will have `eventTime` as `2018-01-31T15:30:00Z`, the scheduled time, but the `actualEventTime` property will have a value of `2018-01-31T15:45:00Z`, the actual time.
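+
+To illustrate, a scheduled trigger that fires once a day could be configured as in the following sketch (the trigger name and values are illustrative):
+
+[source,json]
+----
+{
+ "set-trigger": {
+  "name" : "nightly_scheduled_trigger",
+  "event" : "scheduled",
+  "startTime" : "NOW",
+  "every" : "+1DAY",
+  "graceTime" : "+15MINUTES"
+ }
+}
+----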
-
-== Trigger Configuration
-Trigger configurations are managed using the Autoscaling Write API and the commands `set-trigger`, `remove-trigger`,
-`suspend-trigger`, and `resume-trigger`.
-
-Trigger configuration consists of the following properties:
-
-`name`:: (string, required) A unique trigger configuration name.
-
-`event`:: (string, required) One of the predefined event types (`nodeAdded` or `nodeLost`).
-
-`actions`:: (list of action configs, optional) An ordered list of actions to execute when event is fired.
-
-`waitFor`:: (string, optional) The time to wait between generating new events, as an integer number immediately
-followed by unit symbol, one of `s` (seconds), `m` (minutes), or `h` (hours). Default is `0s`. A condition must
-persist at least for the `waitFor` period to generate an event.
-
-`enabled`:: (boolean, optional) When `true` the trigger is enabled. Default is `true`.
-
-Additional implementation-specific properties may be provided.
-
-Action configuration consists of the following properties:
-
-`name`:: (string, required) A unique name of the action configuration.
-
-`class`:: (string, required) The action implementation class.
-
-Additional implementation-specific properties may be provided
-
-If the `actions` configuration is omitted, then by default, the `ComputePlanAction` and the `ExecutePlanAction` are automatically added to the trigger configuration.
-
-.Example: adding or updating a trigger for `nodeAdded` events
-[source,json]
-----
-{
- "set-trigger": {
-  "name" : "node_added_trigger",
-  "event" : "nodeAdded",
-  "waitFor" : "1s",
-  "enabled" : true,
-  "actions" : [
-   {
-    "name" : "compute_plan",
-    "class": "solr.ComputePlanAction"
-   },
-   {
-    "name" : "custom_action",
-    "class": "com.example.CustomAction"
-   },
-   {
-    "name" : "execute_plan",
-    "class": "solr.ExecutePlanAction"
-   }
-  ]
- }
-}
-----
-
-This trigger configuration will compute and execute a plan to allocate the resources available on the new node. A custom action is also used to possibly modify the plan.


[2/3] lucene-solr:branch_7x: SOLR-12722: expand "params" -> "parameters" (plus a bunch of other things I found in unrelated transformer examples)

Posted by ct...@apache.org.
SOLR-12722: expand "params" -> "parameters" (plus a bunch of other things I found in unrelated transformer examples)


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/a84f84c2
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/a84f84c2
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/a84f84c2

Branch: refs/heads/branch_7x
Commit: a84f84c2f65f714cb003a6c2af730d32fa75f2e7
Parents: d697871
Author: Cassandra Targett <ct...@apache.org>
Authored: Thu Sep 6 08:54:37 2018 -0500
Committer: Cassandra Targett <ct...@apache.org>
Committed: Thu Sep 6 12:20:03 2018 -0500

----------------------------------------------------------------------
 .../src/transforming-result-documents.adoc      | 53 +++++++++++++-------
 1 file changed, 35 insertions(+), 18 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/a84f84c2/solr/solr-ref-guide/src/transforming-result-documents.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/transforming-result-documents.adoc b/solr/solr-ref-guide/src/transforming-result-documents.adoc
index b567f28..4577725 100644
--- a/solr/solr-ref-guide/src/transforming-result-documents.adoc
+++ b/solr/solr-ref-guide/src/transforming-result-documents.adoc
@@ -16,7 +16,7 @@
 // specific language governing permissions and limitations
 // under the License.
 
-Document Transformers can be used to modify the information returned about each documents in the results of a query.
+Document Transformers modify the information returned about documents in the results of a query.
 
 == Using Document Transformers
 
@@ -67,7 +67,7 @@ The above query would produce results like the following:
   ...
 ----
 
-By default, values are returned as a String, but a "```t```" parameter can be specified using a value of int, float, double, or date to force a specific return type:
+By default, values are returned as a String, but a `t` parameter can be specified using a value of `int`, `float`, `double`, or `date` to force a specific return type:
 
 [source,plain]
 ----
@@ -86,7 +86,7 @@ In addition to using these request parameters, you can configure additional name
 </transformer>
 ----
 
-The "```value```" option forces an explicit value to always be used, while the "```defaultValue```" option provides a default that can still be overridden using the "```v```" and "```t```" local parameters.
+The `value` option forces an explicit value to always be used, while the `defaultValue` option provides a default that can still be overridden using the `v` and `t` local parameters.
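+
+For example, if a transformer were configured under the name `mytrans` (an illustrative name), a request could still override its configured default per query:
+
+[source,plain]
+fl=id,my_val:[mytrans v=7 t=int]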
 
 
 === [explain] - ExplainAugmenterFactory
@@ -98,7 +98,7 @@ Augments each document with an inline explanation of its score exactly like the
 q=features:cache&fl=id,[explain style=nl]
 ----
 
-Supported values for `style` are `text`, and `html`, and `nl` which returns the information as structured data:
+Supported values for `style` are `text`, `html`, and `nl`, which returns the information as structured data. Here is the output of the above request using `style=nl`:
 
 [source,json]
 ----
@@ -113,7 +113,7 @@ Supported values for `style` are `text`, and `html`, and `nl` which returns the
 }]}}]}}
 ----
 
-A default style can be configured by specifying an "args" parameter in your configuration:
+A default style can be configured by specifying an `args` parameter in your `solrconfig.xml` configuration:
 
 [source,xml]
 ----
@@ -133,11 +133,14 @@ fl=id,[child parentFilter=doc_type:book childFilter=doc_type:chapter limit=100]
 
 Note that this transformer can be used even though the query itself is not a <<other-parsers.adoc#block-join-query-parsers,Block Join query>>.
 
-When using this transformer, the `parentFilter` parameter must be specified, and works the same as in all Block Join Queries, additional optional parameters are:
+When using this transformer, the `parentFilter` parameter must be specified, and works the same as in all Block Join Queries. Additional optional parameters are:
 
 * `childFilter` - a query to filter which child documents should be included; this can be particularly useful when you have multiple levels of hierarchical documents (default: all children)
 * `limit` - the maximum number of child documents to be returned per parent document (default: 10)
-* `fl` - the field list which the transformer is to return (default: all fields). There is a further limitation in which the fields here should be a subset of those specified by the top level fl param
+* `fl` - the field list which the transformer is to return. The default is all fields.
++
+A further limitation is that the fields specified here must be a subset of those specified in the top-level `fl`
+parameter.
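++
+For example, the following request (field names are illustrative) is valid because the child documents' `fl` asks only for `id`, which also appears in the top-level `fl`:
++
+[source,plain]
+fl=id,title,[child parentFilter=doc_type:book fl=id]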
 
 
 === [shard] - ShardAugmenterFactory
@@ -187,7 +190,7 @@ fl=id,[elevated],[excluded]&excludeIds=GB18030TEST&elevateIds=6H500F0&markExclud
 
 === [json] / [xml]
 
-These transformers replace field value containing a string representation of a valid XML or JSON structure with the actual raw XML or JSON structure rather than just the string value. Each applies only to the specific writer, such that `[json]` only applies to `wt=json` and `[xml]` only applies to `wt=xml`.
+These transformers replace a field value containing a string representation of a valid XML or JSON structure with the actual raw XML or JSON structure instead of just the string value. Each applies only to the specific writer, such that `[json]` only applies to `wt=json` and `[xml]` only applies to `wt=xml`.
 
 [source,plain]
 ----
@@ -202,10 +205,11 @@ This transformer executes a separate query per transforming document passing doc
 * It must be given a unique name: `fl=*,children:[subquery]`
 * There might be a few of them, e.g., `fl=*,sons:[subquery],daughters:[subquery]`.
 * Every `[subquery]` occurrence adds a field into a result document with the given name; the value of this field is a document list, which is the result of executing the subquery using document fields as an input.
-* Subquery would use `/select` search handler by default that causes error if it is not configured. This can be changed by supplying `foo.qt` parameter.
+* Subquery will use the `/select` search handler by default, and will return an error if `/select` is not configured. This can be changed by supplying the `foo.qt` parameter, as in the example below.
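+
+For example (the handler name `/mysearch` is illustrative), a request could point the subquery at a custom handler:
+
+[source,plain]
+fl=*,children:[subquery]&children.qt=/mysearch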
 
-Here is how it looks like in various formats:
+Here is what it looks like in various formats:
 
+.XML
 [source,xml]
 ----
   <result name="response" numFound="2" start="0">
@@ -226,6 +230,7 @@ Here is how it looks like in various formats:
   ...
 ----
 
+.JSON
 [source,json]
 ----
 { "response":{
@@ -245,6 +250,7 @@ Here is how it looks like in various formats:
       }}]}}
 ----
 
+.SolrJ
 [source,java]
 ----
  SolrDocumentList subResults = (SolrDocumentList)doc.getFieldValue("children");
@@ -252,8 +258,9 @@ Here is how it looks like in various formats:
 
 ==== Subquery Result Fields
 
-To appear in subquery document list, a field should be specified both fl parameters, in main one fl (despite the main result documents have no this field) and in subquery's one e.g., `foo.fl`. 
-Of course, you can use wildcard in any or both of these parameters. For example, if field `title` should appear in categories subquery, it can be done via one of these ways.
+To appear in the subquery document list, a field should be specified in both `fl` parameters: in the main `fl` (even though the main result documents do not have this field), and in the subquery's `fl` (e.g., `foo.fl`).
+
+Wildcards can be used in one or both of these parameters. For example, if the field `title` should appear in the `categories` subquery, it can be done in any of the following ways:
 
 [source,plain]
 ----
@@ -267,27 +274,37 @@ fl=...*,categories:[subquery]&categories.fl=*&categories.q=...
 
 If a subquery is declared as `fl=*,foo:[subquery]`, subquery parameters are prefixed with the given name and period. For example:
 
-`q=*:*&fl=*,**foo**:[subquery]&**foo.**q=to be continued&**foo.**rows=10&**foo.**sort=id desc`
+[source,plain,subs="quotes"]
+q=\*:*&fl=\*,**foo**:[subquery]&**foo.**q=to be continued&**foo.**rows=10&**foo.**sort=id desc
 
 ==== Document Field as an Input for Subquery Parameters
 
-It's necessary to pass some document field values as a parameter for subquery. It's supported via implicit *`row.__fieldname__`* parameter, and can be (but might not only) referred via Local Parameters syntax: `q=namne:john&fl=name,id,depts:[subquery]&depts.q={!terms f=id **v=$row.dept_id**}&depts.rows=10`
+It's sometimes necessary to pass document field values as parameters for the subquery. This is supported via an implicit *`row.__fieldname__`* parameter, which can be referenced (among other ways) via the local parameters syntax:
+
+[source,plain,subs="quotes"]
+q=name:john&fl=name,id,depts:[subquery]&depts.q={!terms f=id **v=$row.dept_id**}&depts.rows=10
 
 Here departments are retrieved for every employee in the search result. It's analogous to a SQL `join ON emp.dept_id=dept.id`.
 
 Note that when a document field has multiple values, they are concatenated with a comma by default. This can be changed with the local parameter `foo:[subquery separator=' ']`; this mimics *`{!terms}`* so the two work smoothly together.
 
-To log substituted subquery request parameters, add the corresponding parameter names, as in `depts.logParamsList=q,fl,rows,**row.dept_id**`
+To log substituted subquery request parameters, add the corresponding parameter names, as in: `depts.logParamsList=q,fl,rows,**row.dept_id**`
 
 ==== Cores and Collections in SolrCloud
 
-Use `foo:[subquery fromIndex=departments]` to invoke subquery on another core on the same node, it's what *`{!join}`* does for non-SolrCloud mode. But in case of SolrCloud just (and only) explicitly specify its native parameters like `collection, shards` for subquery, e.g.:
+Use `foo:[subquery fromIndex=departments]` to invoke a subquery on another core on the same node. This is what `{!join}` does in non-SolrCloud mode. In SolrCloud, instead explicitly specify the subquery's native parameters, such as `collection` and `shards`, e.g.:
 
-`q=*:*&fl=*,foo:[subquery]&foo.q=cloud&**foo.collection**=departments`
+[source,plain,subs="quotes"]
+q=\*:*&fl=\*,foo:[subquery]&foo.q=cloud&**foo.collection**=departments
 
 [IMPORTANT]
 ====
-If subquery collection has a different unique key field name (let's say `foo_id` at contrast to `id` in primary collection), add the following parameters to accommodate this difference: `foo.fl=id:foo_id&foo.distrib.singlePass=true`. Otherwise you'll get `NullPoniterException` from `QueryComponent.mergeIds`.
+If the subquery collection has a different unique key field name (such as `foo_id` instead of `id` in the primary collection), add the following parameters to accommodate this difference:
+
+[source,plain]
+foo.fl=id:foo_id&foo.distrib.singlePass=true
+
+Otherwise you'll get `NullPointerException` from `QueryComponent.mergeIds`.
 ====