Posted to commits@lucene.apache.org by jb...@apache.org on 2018/10/28 17:15:28 UTC

[1/3] lucene-solr:branch_7x: SOLR-12913: Streaming Expressions / Math Expressions docs for 7.6 release

Repository: lucene-solr
Updated Branches:
  refs/heads/branch_7x 575d3dac0 -> 7259d782f


SOLR-12913: Streaming Expressions / Math Expressions docs for 7.6 release


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/3b4c0eb5
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/3b4c0eb5
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/3b4c0eb5

Branch: refs/heads/branch_7x
Commit: 3b4c0eb566bd753b82e02ff296d40d92bb386c46
Parents: 575d3da
Author: Joel Bernstein <jb...@apache.org>
Authored: Thu Oct 25 09:16:54 2018 -0400
Committer: Joel Bernstein <jb...@apache.org>
Committed: Sun Oct 28 13:14:35 2018 -0400

----------------------------------------------------------------------
 .../src/computational-geometry.adoc             | 188 +++++++++++++++++++
 solr/solr-ref-guide/src/curve-fitting.adoc      | 100 ++++++++++
 solr/solr-ref-guide/src/dsp.adoc                |  86 +++++++--
 .../src/images/math-expressions/sinewave.png    | Bin 0 -> 273253 bytes
 .../src/images/math-expressions/sinewave256.png | Bin 0 -> 369866 bytes
 solr/solr-ref-guide/src/math-expressions.adoc   |  15 +-
 solr/solr-ref-guide/src/matrix-math.adoc        |  53 ++++++
 solr/solr-ref-guide/src/statistics.adoc         |  87 +++++++++
 .../src/stream-decorator-reference.adoc         |  48 +++++
 solr/solr-ref-guide/src/vector-math.adoc        |  36 ++++
 10 files changed, 587 insertions(+), 26 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/computational-geometry.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/computational-geometry.adoc b/solr/solr-ref-guide/src/computational-geometry.adoc
new file mode 100644
index 0000000..263f6d0
--- /dev/null
+++ b/solr/solr-ref-guide/src/computational-geometry.adoc
@@ -0,0 +1,188 @@
+= Computational Geometry
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+This section of the math expressions user guide covers computational geometry
+functions.
+
+== Convex Hull
+
+A convex hull is the smallest convex set of points that encloses a data set. Math expressions has support for computing
+the convex hull of a 2D data set. Once a convex hull has been calculated, a set of math expression functions
+can be used to geometrically describe the convex hull.
+
+The `convexHull` function finds the convex hull of an observation matrix of 2D vectors.
+Each row of the matrix contains a 2D observation.
+
+In the example below a convex hull is calculated for a randomly generated set of 100 2D observations.
+
+Then the following functions are called on the convex result:
+
+-`getBaryCenter`: Returns the 2D point that is the bary center of the convex hull.
+
+-`getArea`: Returns the area of the convex hull.
+
+-`getBoundarySize`: Returns the boundary size of the convex hull.
+
+-`getVertices`: Returns 2D points that are the vertices of the convex hull.
+
+
+[source,text]
+----
+let(echo="baryCenter, area, boundarySize, vertices",
+    x=sample(normalDistribution(0, 20), 100),
+    y=sample(normalDistribution(0, 10), 100),
+    observations=transpose(matrix(x,y)),
+    chull=convexHull(observations),
+    baryCenter=getBaryCenter(chull),
+    area=getArea(chull),
+    boundarySize=getBoundarySize(chull),
+    vertices=getVertices(chull))
+----
+
+When this expression is sent to the `/stream` handler it responds with:
+
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "baryCenter": [
+          -3.0969292101230343,
+          1.2160948182691975
+        ],
+        "area": 3477.480599967595,
+        "boundarySize": 267.52419019533664,
+        "vertices": [
+          [
+            -66.17632818958485,
+            -8.394931552315256
+          ],
+          [
+            -47.556667594765216,
+            -16.940434013651263
+          ],
+          [
+            -33.13582183446102,
+            -17.30914425443977
+          ],
+          [
+            -9.97459859015698,
+            -17.795012801599654
+          ],
+          [
+            27.7705917246824,
+            -14.487224686587767
+          ],
+          [
+            54.689432954170236,
+            -1.3333371984299605
+          ],
+          [
+            35.97568654458672,
+            23.054169251772556
+          ],
+          [
+            -15.539456215337585,
+            19.811330468093704
+          ],
+          [
+            -17.05125031092752,
+            19.53581741341663
+          ],
+          [
+            -35.92010024412891,
+            15.126430698395572
+          ]
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 3
+      }
+    ]
+  }
+}
+----
+
+== Enclosing Disk
+
+The `enclosingDisk` function finds the smallest enclosing circle the encloses a 2D data set.
+Once an enclosing disk has been calculated, a set of math expression functions
+can be used to geometrically describe the enclosing disk.
+
+In the example below an enclosing disk is calculated for a randomly generated set of 1000 2D observations.
+
+Then the following functions are called on the enclosing disk result:
+
+-`getCenter`: Returns the 2D point that is the center of the disk.
+
+-`getRadius`: Returns the radius of the disk.
+
+-`getSupportPoints`: Returns the support points of the disk.
+
+[source,text]
+----
+let(echo="center, radius, support",
+    x=sample(normalDistribution(0, 20), 1000),
+    y=sample(normalDistribution(0, 20), 1000),
+    observations=transpose(matrix(x,y)),
+    disk=enclosingDisk(observations),
+    center=getCenter(disk),
+    radius=getRadius(disk),
+    support=getSupportPoints(disk))
+----
+
+When this expression is sent to the `/stream` handler it responds with:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "center": [
+          -6.668825009733749,
+          -2.9825450908240025
+        ],
+        "radius": 72.66109546907208,
+        "support": [
+          [
+            20.350992271739464,
+            64.46791279377014
+          ],
+          [
+            33.02079953093981,
+            57.880978456420365
+          ],
+          [
+            -44.7273247899923,
+            -64.87911518353323
+          ]
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 8
+      }
+    ]
+  }
+}
+----

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/curve-fitting.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/curve-fitting.adoc b/solr/solr-ref-guide/src/curve-fitting.adoc
index a86f7ea..dc66eeb 100644
--- a/solr/solr-ref-guide/src/curve-fitting.adoc
+++ b/solr/solr-ref-guide/src/curve-fitting.adoc
@@ -221,6 +221,106 @@ responds with:
 }
 ----
 
+== Harmonic Curve Fitting
+
+The `harmonicFit` function or `harmfit` (for short) fits a smooth line through control points of a sine wave.
+The `harmfit` function is passed x- and y-axes and fits a smooth curve to the data.
+If only a single array is provided it is treated as the y-axis and a sequence is generated
+for the x-axis.
+
+The example below shows `harmfit` fitting a single oscillation of a sine wave. `harmfit`
+returns the smoothed values at each control point. The return value is also a model which can be used by
+the `predict`, `derivative` and `integrate` functions. There are also three helper functions that can be used to
+retrieve the estimated parameters of the fitted model:
+
+* `getAmplitude`: Returns the amplitude of the sine wave.
+* `getAngularFrequency`: Returns the angular frequency of the sine wave.
+* `getPhase`: Returns the phase of the sine wave.
+
+*Note*: The `harmfit` function works best when run on a single oscillation rather than a long sequence of
+oscillations. This is particularly true if the sine wave has noise. After the curve has been fit it can be
+extrapolated to any point in time in the past or future.
+
+In the example below the `harmfit` function fits control points, provided as x and y axes, and the
+angular frequency, phase and amplitude are retrieved from the fitted model.
+
+[source,text]
+----
+let(echo="freq, phase, amp",
+    x=array(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19),
+    y=array(-0.7441113653915925,-0.8997532112139415, -0.9853140681578838, -0.9941296760805463,
+            -0.9255133950087844, -0.7848096869247675, -0.5829778403072583, -0.33573836075915076,
+            -0.06234851460699166, 0.215897602691855, 0.47732764497752245, 0.701579055431586,
+             0.8711850882773975, 0.9729352782968976, 0.9989043923858761, 0.9470697190130273,
+             0.8214686154479715, 0.631884041542757, 0.39308257356494, 0.12366424851680227),
+    model=harmfit(x, y),
+    freq=getAngularFrequency(model),
+    phase=getPhase(model),
+    amp=getAmplitude(model))
+----
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "freq": 0.28,
+        "phase": 2.4100000000000006,
+        "amp": 0.9999999999999999
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
+=== Interpolation and Extrapolation
+
+The `harmfit` function returns a fitted model of the sine wave that can be used by the `predict` function to
+interpolate or extrapolate the sine wave.
+
+The example below uses the fitted model to extrapolate the sine wave beyond the control points
+to the x-axis points 20, 21, 22, 23.
+
+
+[source,text]
+----
+let(x=array(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19),
+    y=array(-0.7441113653915925,-0.8997532112139415, -0.9853140681578838, -0.9941296760805463,
+            -0.9255133950087844, -0.7848096869247675, -0.5829778403072583, -0.33573836075915076,
+            -0.06234851460699166, 0.215897602691855, 0.47732764497752245, 0.701579055431586,
+             0.8711850882773975, 0.9729352782968976, 0.9989043923858761, 0.9470697190130273,
+             0.8214686154479715, 0.631884041542757, 0.39308257356494, 0.12366424851680227),
+    model=harmfit(x, y),
+    extrapolation=predict(model, array(20, 21, 22, 23)))
+----
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "extrapolation": [
+          -0.1553861764415666,
+          -0.42233370833176975,
+          -0.656386037906838,
+          -0.8393130343914845
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
 
 == Gaussian Curve Fitting
 

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/dsp.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/dsp.adoc b/solr/solr-ref-guide/src/dsp.adoc
index 60a6594..04524dc 100644
--- a/solr/solr-ref-guide/src/dsp.adoc
+++ b/solr/solr-ref-guide/src/dsp.adoc
@@ -525,6 +525,46 @@ When this expression is sent to the `/stream` handler it responds with:
 }
 ----
 
+== Oscillate (Sine Wave)
+
+The `oscillate` function generates a periodic oscillating signal based
+on its parameters. The `oscillate` function can be used to study,
+combine and model sine waves.
+
+The `oscillate` function takes three parameters: amplitude, angular frequency
+and phase. It returns a vector containing the y-axis points of a sine wave.
+
+The y-axis points are generated from the sequence 0-127.
+
+Below is an example of the `oscillate` function called with an amplitude of
+1, an angular frequency of 0.28 and a phase of 1.57.
+
+[source,text]
+----
+oscillate(1, 0.28, 1.57)
+----
+
+The result of the `oscillate` function is plotted below:
+
+image::images/math-expressions/sinewave.png[]
+
+=== Sine Wave Interpolation, Extrapolation
+
+The `oscillate` function returns a function which can be used by the `predict` function to interpolate or extrapolate a sine wave.
+The example below extrapolates the sine wave to the sequence 0-255.
+
+
+[source,text]
+----
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)))
+----
+
+The extrapolated sine wave is plotted below:
+
+image::images/math-expressions/sinewave256.png[]
+
+
 == Autocorrelation
 
 Autocorrelation measures the degree to which a signal is correlated with itself. Autocorrelation is used to determine
@@ -532,15 +572,16 @@ if a vector contains a signal or is purely random.
 
 A few examples, with plots, will help to understand the concepts.
 
-In the first example the `sin` function is wrapped around a `sequence` function to generate a sine wave. The result of this
+The first example simply revisits the example above of an extrapolated sine wave. The result of this
 is plotted in the image below. Notice that there is a structure to the plot that is clearly not random.
 
 [source,text]
 ----
-sin(sequence(256, 0, 6))
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)))
 ----
 
-image::images/math-expressions/signal.png[]
+image::images/math-expressions/sinewave256.png[]
 
 
 In the next example the `sample` function is used to draw 256 samples from a `uniformDistribution` to create a
@@ -562,9 +603,10 @@ becomes more dense it can become harder to see a pattern hidden within noise.
 
 [source,text]
 ----
-let(a=sin(sequence(256, 0, 6)),
-    b=sample(uniformDistribution(-1.5, 1.5), 256),
-    c=ebeAdd(a, b))
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)),
+    c=sample(uniformDistribution(-1.5, 1.5), 256),
+    d=ebeAdd(b,c))
 ----
 
 image::images/math-expressions/hidden-signal.png[]
@@ -585,8 +627,9 @@ This is the autocorrelation plot of a pure signal.
 
 [source,text]
 ----
-let(a=sin(sequence(256, 0, 6)),
-    b=conv(a, rev(a)),
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)),
+    c=conv(b, rev(b)))
 ----
 
 image::images/math-expressions/signal-autocorrelation.png[]
@@ -615,10 +658,11 @@ strongly that there is an underlying signal hidden within the noise.
 
 [source,text]
 ----
-let(a=sin(sequence(256, 0, 6)),
-    b=sample(uniformDistribution(-1.5, 1.5), 256),
-    c=ebeAdd(a, b),
-    d=conv(c, rev(c))
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)),
+    c=sample(uniformDistribution(-1.5, 1.5), 256),
+    d=ebeAdd(b, c),
+    e=conv(d, rev(d)))
 ----
 
 image::images/math-expressions/hidden-signal-autocorrelation.png[]
@@ -675,9 +719,10 @@ associated with them. This `fft` shows a clear signal with very low levels of no
 
 [source,text]
 ----
-let(a=sin(sequence(256, 0, 6)),
-    b=fft(a),
-    c=rowAt(b, 0))
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)),
+    c=fft(b),
+    d=rowAt(c, 0))
 ----
 
 
@@ -709,11 +754,12 @@ shows that there is considerable noise along with the signal.
 
 [source,text]
 ----
-let(a=sin(sequence(256, 0, 6)),
-    b=sample(uniformDistribution(-1.5, 1.5), 256),
-    c=ebeAdd(a, b),
-    d=fft(c),
-    e=rowAt(d, 0))
+let(a=oscillate(1, 0.28, 1.57),
+    b=predict(a, sequence(256, 0, 1)),
+    c=sample(uniformDistribution(-1.5, 1.5), 256),
+    d=ebeAdd(b, c),
+    e=fft(d),
+    f=rowAt(e, 0))
 ----
 
 image::images/math-expressions/hidden-signal-fft.png[]

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/images/math-expressions/sinewave.png
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/images/math-expressions/sinewave.png b/solr/solr-ref-guide/src/images/math-expressions/sinewave.png
new file mode 100644
index 0000000..19d9b93
Binary files /dev/null and b/solr/solr-ref-guide/src/images/math-expressions/sinewave.png differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/images/math-expressions/sinewave256.png
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/images/math-expressions/sinewave256.png b/solr/solr-ref-guide/src/images/math-expressions/sinewave256.png
new file mode 100644
index 0000000..e821057
Binary files /dev/null and b/solr/solr-ref-guide/src/images/math-expressions/sinewave256.png differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/math-expressions.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/math-expressions.adoc b/solr/solr-ref-guide/src/math-expressions.adoc
index 3974989..595a7b1 100644
--- a/solr/solr-ref-guide/src/math-expressions.adoc
+++ b/solr/solr-ref-guide/src/math-expressions.adoc
@@ -1,5 +1,5 @@
 = Math Expressions
-:page-children: scalar-math, vector-math, variables, matrix-math, vectorization, term-vectors, statistics, probability-distributions, simulations, time-series, regression, numerical-analysis, curve-fitting, dsp, machine-learning
+:page-children: scalar-math, vector-math, variables, matrix-math, vectorization, term-vectors, statistics, probability-distributions, simulations, time-series, regression, numerical-analysis, curve-fitting, dsp, machine-learning, computational-geometry
 
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
@@ -44,18 +44,21 @@ record in your Solr Cloud cluster computable.
 
 *<<statistics.adoc#statistics,Statistics>>*: Statistical functions in math expressions.
 
-*<<probability-distributions.adoc#probability-distributions,Probability>>*: Mathematical models for probability.
+*<<probability-distributions.adoc#probability-distributions,Probability>>*: Mathematical models of probability.
 
-*<<simulations.adoc#simulations,Monte Carlo Simulations>>*: Performing correlated and uncorrelated Monte Carlo simulations.
+*<<simulations.adoc#simulations,Monte Carlo Simulations>>*: Performing uncorrelated and correlated Monte Carlo simulations.
 
 *<<regression.adoc#regression,Linear Regression>>*: Simple and multivariate linear regression.
 
 *<<numerical-analysis.adoc#numerical-analysis,Interpolation, Derivatives and Integrals>>*: Numerical analysis math expressions.
 
-*<<curve-fitting.adoc#curve-fitting,Curve Fitting>>*: Polynomial and Gaussian curve fitting.
-
 *<<dsp.adoc#dsp,Digital Signal Processing>>*: Functions commonly used with digital signal processing.
 
-*<<time-series.adoc#time-series,Time Series>>*: Aggregation, smoothing, differencing, modeling and anomaly detection for time series.
+*<<curve-fitting.adoc#curve-fitting,Curve Fitting>>*: Polynomial, Harmonic and Gaussian curve fitting.
+
+*<<time-series.adoc#time-series,Time Series>>*: Aggregation, smoothing and differencing of time series.
 
 *<<machine-learning.adoc#machine-learning,Machine Learning>>*: Functions used in machine learning.
+
+*<<computational-geometry.adoc#computational-geometry,Computational Geometry>>*: Convex Hulls and Enclosing Disks.
+

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/matrix-math.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/matrix-math.adoc b/solr/solr-ref-guide/src/matrix-math.adoc
index b73f01a..36191c4 100644
--- a/solr/solr-ref-guide/src/matrix-math.adoc
+++ b/solr/solr-ref-guide/src/matrix-math.adoc
@@ -103,6 +103,59 @@ responds with:
 }
 ----
 
+== Pair sorting vectors
+
+The `pairSort` function can be used to sort two vectors based on the values in
+the first vector. The sorting operation maintains the pairing between
+the two vectors during the sort.
+
+The `pairSort` function returns a matrix containing the
+pair sorted vectors. The first row in the matrix is the first vector,
+the second row in the matrix is the second vector.
+
+The individual vectors can then be accessed using the `rowAt` function.
+
+The example below performs a pair sort on two vectors and returns the
+matrix containing the sorted vectors.
+
+----
+let(a=array(10, 2, 1),
+    b=array(100, 200, 300),
+    c=pairSort(a, b))
+----
+
+When this expression is sent to the `/stream` handler it
+responds with:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "c": [
+          [
+            1,
+            2,
+            10
+          ],
+          [
+            300,
+            200,
+            100
+          ]
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 1
+      }
+    ]
+  }
+}
+----
+
+
 == Row and Column Labels
 
 A matrix can have column and rows and labels. The functions

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/statistics.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/statistics.adoc b/solr/solr-ref-guide/src/statistics.adoc
index af908d3..2d7a548 100644
--- a/solr/solr-ref-guide/src/statistics.adoc
+++ b/solr/solr-ref-guide/src/statistics.adoc
@@ -537,6 +537,8 @@ array.
 
 * `log`: Returns a numeric array with the natural log of each element of the original array.
 
+* `log10`: Returns a numeric array with the base 10 log of each element of the original array.
+
 * `sqrt`: Returns a numeric array with the square root of each element of the original array.
 
 * `cbrt`: Returns a numeric array with the cube root of each element of the original array.
@@ -574,6 +576,91 @@ When this expression is sent to the `/stream` handler it responds with:
 }
 ----
 
+== Back Transformations
+
+Vectors that have been transformed with the `log`, `log10`, `sqrt` and `cbrt` functions
+can be back transformed using the `pow` function.
+
+The example below shows how to back transform data that has been transformed by the
+`sqrt` function.
+
+
+[source,text]
+----
+let(echo="b,c",
+    a=array(100, 200, 300),
+    b=sqrt(a),
+    c=pow(b, 2))
+----
+
+When this expression is sent to the `/stream` handler it responds with:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "b": [
+          10,
+          14.142135623730951,
+          17.320508075688775
+        ],
+        "c": [
+          100,
+          200.00000000000003,
+          300.00000000000006
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
+The example below shows how to back transform data that has been transformed by the
+`log10` function.
+
+
+[source,text]
+----
+let(echo="b,c",
+    a=array(100, 200, 300),
+    b=log10(a),
+    c=pow(10, b))
+----
+
+When this expression is sent to the `/stream` handler it responds with:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "b": [
+          2,
+          2.3010299956639813,
+          2.4771212547196626
+        ],
+        "c": [
+          100,
+          200.00000000000003,
+          300.0000000000001
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
 == Z-scores
 
 The `zscores` function converts a numeric array to an array of z-scores. The z-score

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/stream-decorator-reference.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/stream-decorator-reference.adoc b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
index 5549fc2..aa8c55d 100644
--- a/solr/solr-ref-guide/src/stream-decorator-reference.adoc
+++ b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
@@ -735,6 +735,29 @@ leftOuterJoin(
 )
 ----
 
+[#list_expression]
+== list
+
+The `list` function wraps N Stream Expressions and opens and iterates each stream sequentially.
+This has the effect of concatenating the results of multiple Streaming Expressions.
+
+=== list Parameters
+
+* StreamExpressions ...: N Streaming Expressions
+
+=== list Syntax
+
+[source,text]
+----
+list(tuple(a="hello world"), tuple(a="HELLO WORLD"))
+
+list(search(collection1, q="*:*", fl="id, prod_ss", sort="id asc"),
+     search(collection2, q="*:*", fl="id, prod_ss", sort="id asc"))
+
+list(tuple(a=search(collection1, q="*:*", fl="id, prod_ss", sort="id asc")),
+     tuple(a=search(collection2, q="*:*", fl="id, prod_ss", sort="id asc")))
+----
+
 == hashJoin
 
 The `hashJoin` function wraps two streams, Left and Right, and for every tuple in Left which exists in Right will emit a tuple containing the fields of both tuples. This supports one-to-one, one-to-many, many-to-one, and many-to-many inner join scenarios. The tuples are emitted in the order in which they appear in the Left stream. The order of the streams does not matter. If both tuples contain a field of the same name then the value from the Right stream will be used in the emitted tuple.
@@ -1028,6 +1051,31 @@ The following is a `solrconfig.xml` snippet for 2 workers and "year_i" as the `p
 ----
 ====
 
+== plist
+
+The `plist` function wraps N Stream Expressions and opens the streams in parallel
+and iterates each stream sequentially. The difference between `list` and `plist` is that
+the streams are opened in parallel. Since many streams such as
+`facet`, `stats` and `significantTerms` push down heavy operations to Solr when they are opened,
+the `plist` function can dramatically improve performance by doing these operations in parallel.
+
+=== plist Parameters
+
+* StreamExpressions ...: N Streaming Expressions
+
+=== plist Syntax
+
+[source,text]
+----
+plist(tuple(a="hello world"), tuple(a="HELLO WORLD"))
+
+plist(search(collection1, q="*:*", fl="id, prod_ss", sort="id asc"),
+      search(collection2, q="*:*", fl="id, prod_ss", sort="id asc"))
+
+plist(tuple(a=search(collection1, q="*:*", fl="id, prod_ss", sort="id asc")),
+      tuple(a=search(collection2, q="*:*", fl="id, prod_ss", sort="id asc")))
+----
+
 == priority
 
 The `priority` function is a simple priority scheduler for the <<executor>> function. The `executor` function doesn't directly have a concept of task prioritization; instead it simply executes tasks in the order that they are read from it's underlying stream. The `priority` function provides the ability to schedule a higher priority task ahead of lower priority tasks that were submitted earlier.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/3b4c0eb5/solr/solr-ref-guide/src/vector-math.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/vector-math.adoc b/solr/solr-ref-guide/src/vector-math.adoc
index e06008d..d610d4e 100644
--- a/solr/solr-ref-guide/src/vector-math.adoc
+++ b/solr/solr-ref-guide/src/vector-math.adoc
@@ -143,6 +143,42 @@ When this expression is sent to the `/stream` handler it responds with:
 }
 ----
 
+== Vector Sorting
+
+An array can be sorted in natural ascending order with the `asc` function.
+
+[source,text]
+----
+asc(array(10,1,2,3,4,5,6))
+----
+
+When this expression is sent to the `/stream` handler it responds with:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "return-value": [
+          1,
+          2,
+          3,
+          4,
+          5,
+          6,
+          10
+        ]
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 1
+      }
+    ]
+  }
+}
+----
+
 == Vector Summarizations and Norms
 
 There are a set of functions that perform summarizations and return norms of arrays. These functions

[3/3] lucene-solr:branch_7x: SOLR-12913: RefGuide formatting

Posted by jb...@apache.org.

SOLR-12913: RefGuide formatting


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/7259d782
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/7259d782
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/7259d782

Branch: refs/heads/branch_7x
Commit: 7259d782fb39e6e82b28ac8073dbc03272b8819a
Parents: 4399bcd
Author: Joel Bernstein <jb...@apache.org>
Authored: Fri Oct 26 13:41:09 2018 -0400
Committer: Joel Bernstein <jb...@apache.org>
Committed: Sun Oct 28 13:15:12 2018 -0400

----------------------------------------------------------------------
 solr/solr-ref-guide/src/dsp.adoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/7259d782/solr/solr-ref-guide/src/dsp.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/dsp.adoc b/solr/solr-ref-guide/src/dsp.adoc
index 8f3b24b..e9d4fd6 100644
--- a/solr/solr-ref-guide/src/dsp.adoc
+++ b/solr/solr-ref-guide/src/dsp.adoc
@@ -28,6 +28,7 @@ the more advanced DSP functions its useful to develop a deeper intuition of the
 The dot product operation is performed in two steps:
 
 1) Element-by-element multiplication of two vectors which produces a vector of products.
+
 2) Sum the vector of products to produce a scalar result.
 
 This simple bit of math has a number of important applications.
@@ -146,7 +147,8 @@ When this expression is sent to the `/stream` handler it responds with:
 ----
 
 In the example above two arrays were combined in a way that produced the mean of the first. In the second array
-each value was set to ".2". Another way of looking at this is that each value in the second array has the same weight.
+each value was set to .2. Another way of looking at this is that each value in the second array is
+applying the same weight to the values in the first array.
 By varying the weights in the second array we can produce a different result.
 For example if the first array represents a time series,
 the weights in the second array can be set to add more weight to a particular element in the first array.

[2/3] lucene-solr:branch_7x: SOLR-12913: RefGuide revisions

Posted by jb...@apache.org.

SOLR-12913: RefGuide revisions


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/4399bcd0
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/4399bcd0
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/4399bcd0

Branch: refs/heads/branch_7x
Commit: 4399bcd0abc30f9a803275b4e8f9a77734b62508
Parents: 3b4c0eb
Author: Joel Bernstein <jb...@apache.org>
Authored: Fri Oct 26 13:31:22 2018 -0400
Committer: Joel Bernstein <jb...@apache.org>
Committed: Sun Oct 28 13:14:55 2018 -0400

----------------------------------------------------------------------
 .../src/computational-geometry.adoc             | 12 ++---
 solr/solr-ref-guide/src/curve-fitting.adoc      |  9 ++--
 solr/solr-ref-guide/src/dsp.adoc                | 48 ++++++++++++++------
 solr/solr-ref-guide/src/matrix-math.adoc        |  4 +-
 solr/solr-ref-guide/src/statistics.adoc         |  2 +-
 .../src/stream-decorator-reference.adoc         | 46 +++++++++----------
 solr/solr-ref-guide/src/vector-math.adoc        |  4 +-
 7 files changed, 75 insertions(+), 50 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/computational-geometry.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/computational-geometry.adoc b/solr/solr-ref-guide/src/computational-geometry.adoc
index 263f6d0..abcdb08 100644
--- a/solr/solr-ref-guide/src/computational-geometry.adoc
+++ b/solr/solr-ref-guide/src/computational-geometry.adoc
@@ -24,14 +24,14 @@ functions.
 
 A convex hull is the smallest convex set of points that encloses a data set. Math expressions has support for computing
 the convex hull of a 2D data set. Once a convex hull has been calculated, a set of math expression functions
-can be used to geometrically describe the convex hull.
+can be applied to geometrically describe the convex hull.
 
 The `convexHull` function finds the convex hull of an observation matrix of 2D vectors.
-Each row of the matrix contains a 2D observation.
+Each row of the matrix is a 2D observation.
 
 In the example below a convex hull is calculated for a randomly generated set of 100 2D observations.
 
-Then the following functions are called on the convex result:
+Then the following functions are called on the convex hull:
 
 -`getBaryCenter`: Returns the 2D point that is the bary center of the convex hull.
 
@@ -39,7 +39,7 @@ Then the following functions are called on the convex result:
 
 -`getBoundarySize`: Returns the boundary size of the convex hull.
 
--`getVertices`: Returns 2D points that are the vertices of the convex hull.
+-`getVertices`: Returns a set of 2D points that are the vertices of the convex hull.
 
 
 [source,text]
@@ -126,11 +126,11 @@ When this expression is sent to the `/stream` handler it responds with:
 
 The `enclosingDisk` function finds the smallest enclosing circle the encloses a 2D data set.
 Once an enclosing disk has been calculated, a set of math expression functions
-can be used to geometrically describe the enclosing disk.
+can be applied to geometrically describe the enclosing disk.
 
 In the example below an enclosing disk is calculated for a randomly generated set of 1000 2D observations.
 
-Then the following functions are called on the enclosing disk result:
+Then the following functions are called on the enclosing disk:
 
 -`getCenter`: Returns the 2D point that is the center of the disk.
 

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/curve-fitting.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/curve-fitting.adoc b/solr/solr-ref-guide/src/curve-fitting.adoc
index dc66eeb..c2086b1 100644
--- a/solr/solr-ref-guide/src/curve-fitting.adoc
+++ b/solr/solr-ref-guide/src/curve-fitting.adoc
@@ -230,10 +230,11 @@ for the x-axis.
 
 The example below shows `harmfit` fitting a single oscillation of a sine wave. `harmfit`
 returns the smoothed values at each control point. The return value is also a model which can be used by
-the `predict`, `derivative` and `integrate` function. There are also three helper functions that can be used to
-retrieve the estimated parameters of the fitted model:
+the `predict`, `derivative` and `integrate` functions.
 
-* `getAmplitude`: Returns the amplitude of sine wave.
+There are also three helper functions that can be used to retrieve the estimated parameters of the fitted model:
+
+* `getAmplitude`: Returns the amplitude of the sine wave.
 * `getAngularFrequency`: Returns the angular frequency of the sine wave.
 * `getPhase`: Returns the phase of the sine wave.
 
@@ -241,7 +242,7 @@ retrieve the estimated parameters of the fitted model:
 oscillations. This is particularly true if the sine wave has noise. After the curve has been fit it can be
 extrapolated to any point in time in the past or future.
 
-In example below the `harmfit` function fits control points, provided as x and y axes and the
+In the example below the `harmfit` function fits control points, provided as x and y axes, and then the
 angular frequency, phase and amplitude are retrieved from the fitted model.
 
 [source,text]

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/dsp.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/dsp.adoc b/solr/solr-ref-guide/src/dsp.adoc
index 04524dc..8f3b24b 100644
--- a/solr/solr-ref-guide/src/dsp.adoc
+++ b/solr/solr-ref-guide/src/dsp.adoc
@@ -21,19 +21,43 @@ Digital Signal Processing (DSP).
 
 == Dot Product
 
-The `dotProduct` function is used to calculate the dot product of two arrays.
+The `dotProduct` function is used to calculate the dot product of two numeric arrays.
 The dot product is a fundamental calculation for the DSP functions discussed in this section. Before diving into
-the more advanced DSP functions, its useful to get a better understanding of how the dot product calculation works.
+the more advanced DSP functions it's useful to develop a deeper intuition of the dot product.
 
-=== Combining Two Arrays
+The dot product operation is performed in two steps:
 
-The `dotProduct` function can be used to combine two arrays into a single product. A simple example can help
-illustrate this concept.
+1) Element-by-element multiplication of two vectors which produces a vector of products.
+2) Sum the vector of products to produce a scalar result.
+
+This simple bit of math has a number of important applications.
+
+=== Representing Linear Combinations
+
+The `dotProduct` performs the math of a Linear Combination. A Linear Combination has the following form:
+
+[source,text]
+----
+(a1*v1)+(a2*v2)...
+----
+
+In the above example a1 and a2 are random variables that change. v1 and v2 are *constant values*.
+
+When computing the dot product the elements of two vectors are multiplied together and the results are added.
+If the first vector contains random variables and the second vector contains *constant values*
+then the dot product is performing a linear combination.
+
+This scenario comes up again and again in machine learning. For example both linear and logistic regression
+solve for a vector of constant weights. In order to perform a prediction, a dot product is calculated
+between a random observation vector and the constant weight vector. That dot product is a linear combination because
+one of the vectors holds constant weights.
+
+Let's look at a simple example of how a linear combination can be used to find the *mean* of a vector of numbers.
 
 In the example below two arrays are set to variables *`a`* and *`b`* and then operated on by the `dotProduct` function.
 The output of the `dotProduct` function is set to variable *`c`*.
 
-Then the `mean` function is then used to compute the mean of the first array which is set to the variable *`d`*.
+The `mean` function is then used to compute the mean of the first array which is set to the variable *`d`*.
 
 Both the dot product and the mean are included in the output.
 
@@ -527,14 +551,12 @@ When this expression is sent to the `/stream` handler it responds with:
 
 == Oscillate (Sine Wave)
 
-The `oscillate` function generates a periodic oscillating signal based
-on a parameters. The `oscillate` function can be used to study,
-combine and model sine waves.
+The `oscillate` function generates a periodic oscillating signal which can be used to model and study sine waves.
 
-The `oscillate` function takes three parameters: amplitude, angular frequency
-and phase and returns a vector contain the y axis points of sine wave.
+The `oscillate` function takes three parameters: *amplitude*, *angular frequency*
+and *phase* and returns a vector containing the y-axis points of a sine wave.
 
-The y axis points were generated from a sequence 0-127.
+The y-axis points were generated from an x-axis sequence of 0-127.
 
 Below is an example of the `oscillate` function called with an amplitude of
 1, and angular frequency of .28 and phase of 1.57.
@@ -551,7 +573,7 @@ image::images/math-expressions/sinewave.png[]
 === Sine Wave Interpolation, Extrapolation
 
 The `oscillate` function returns a function which can be used by the `predict` function to interpolate or extrapolate a sine wave.
-The example below extrapolates the sine wave to a sequence from 0-256.
+The example below extrapolates the sine wave to an x-axis sequence of 0-256.
 
 
 [source,text]

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/matrix-math.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/matrix-math.adoc b/solr/solr-ref-guide/src/matrix-math.adoc
index 36191c4..b5cce75 100644
--- a/solr/solr-ref-guide/src/matrix-math.adoc
+++ b/solr/solr-ref-guide/src/matrix-math.adoc
@@ -103,7 +103,7 @@ responds with:
 }
 ----
 
-== Pair sorting vectors
+== Pair Sorting Vectors
 
 The `pairSort` function can be used to sort two vectors based on the values in
 the first vector. The sorting operation maintains the pairing between
@@ -115,7 +115,7 @@ the second row in the matrix is the second vector.
 
 The individual vectors can then be accessed using the `rowAt` function.
 
-The example below performs a pair sort on two vectors and returns the
+The example below performs a pair sort of two vectors and returns the
 matrix containing the sorted vectors.
 
 ----
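
Editorial sketch (not part of the commit): the pair sort described in this hunk can be approximated in Python as below. The function `pair_sort` is a hypothetical stand-in, not Solr's `pairSort`; it assumes ascending order on the first vector and makes no claim about how Solr breaks ties.

```python
def pair_sort(first, second):
    """Sort two equal-length vectors by the first, keeping pairs together.

    Returns a two-row matrix: row 0 is the sorted first vector,
    row 1 is the second vector reordered to match.
    """
    if len(first) != len(second):
        raise ValueError("vectors must have the same length")
    # Sort the (first, second) pairs by the value taken from the first vector.
    pairs = sorted(zip(first, second), key=lambda pair: pair[0])
    sorted_first = [pair[0] for pair in pairs]
    sorted_second = [pair[1] for pair in pairs]
    return [sorted_first, sorted_second]

# Hypothetical input vectors.
matrix = pair_sort([3, 1, 2], [30, 10, 20])
first_row = matrix[0]   # plays the role of rowAt(matrix, 0)
second_row = matrix[1]  # plays the role of rowAt(matrix, 1)
```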

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/statistics.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/statistics.adoc b/solr/solr-ref-guide/src/statistics.adoc
index 2d7a548..324a8a6 100644
--- a/solr/solr-ref-guide/src/statistics.adoc
+++ b/solr/solr-ref-guide/src/statistics.adoc
@@ -578,7 +578,7 @@ When this expression is sent to the `/stream` handler it responds with:
 
 == Back Transformations
 
-Vectors that have been transformed with `log`, `log10`, `sqrt` and `cbrt` functions
+Vectors that have been transformed with the `log`, `log10`, `sqrt` and `cbrt` functions
 can be back transformed using the `pow` function.
 
 The example below shows how to back transform data that has been transformed by the

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/stream-decorator-reference.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/stream-decorator-reference.adoc b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
index aa8c55d..d9186c9 100644
--- a/solr/solr-ref-guide/src/stream-decorator-reference.adoc
+++ b/solr/solr-ref-guide/src/stream-decorator-reference.adoc
@@ -735,29 +735,6 @@ leftOuterJoin(
 )
 ----
 
-[#list_expression]
-== list
-
-The `list` function wraps N Stream Expressions and opens and iterates each stream sequentially.
-This has the effect of concatenating the results of multiple Streaming Expressions.
-
-=== list Parameters
-
-* StreamExpressions ...: N Streaming Expressions
-
-=== list Syntax
-
-[source,text]
-----
-list(tuple(a="hello world"), tuple(a="HELLO WORLD"))
-
-list(search(collection1, q="*:*", fl="id, prod_ss", sort="id asc"),
-     search(collection2, q="*:*", fl="id, prod_ss", sort="id asc"))
-
-list(tuple(a=search(collection1, q="*:*", fl="id, prod_ss", sort="id asc")),
-     tuple(a=search(collection2, q="*:*", fl="id, prod_ss", sort="id asc")))
-----
-
 == hashJoin
 
 The `hashJoin` function wraps two streams, Left and Right, and for every tuple in Left which exists in Right will emit a tuple containing the fields of both tuples. This supports one-to-one, one-to-many, many-to-one, and many-to-many inner join scenarios. The tuples are emitted in the order in which they appear in the Left stream. The order of the streams does not matter. If both tuples contain a field of the same name then the value from the Right stream will be used in the emitted tuple.
@@ -863,6 +840,29 @@ intersect(
 )
 ----
 
+[#list_expression]
+== list
+
+The `list` function wraps N Stream Expressions and opens and iterates each stream sequentially.
+This has the effect of concatenating the results of multiple Streaming Expressions.
+
+=== list Parameters
+
+* StreamExpressions ...: N Streaming Expressions
+
+=== list Syntax
+
+[source,text]
+----
+list(tuple(a="hello world"), tuple(a="HELLO WORLD"))
+
+list(search(collection1, q="*:*", fl="id, prod_ss", sort="id asc"),
+     search(collection2, q="*:*", fl="id, prod_ss", sort="id asc"))
+
+list(tuple(a=search(collection1, q="*:*", fl="id, prod_ss", sort="id asc")),
+     tuple(a=search(collection2, q="*:*", fl="id, prod_ss", sort="id asc")))
+----
+
 == merge
 
 The `merge` function merges two or more streaming expressions and maintains the ordering of the underlying streams. Because the order is maintained, the sorts of the underlying streams must line up with the on parameter provided to the merge function.
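
Editorial sketch (not part of the commit): the difference between `list` (sequential concatenation) and `merge` (order-preserving merge) can be illustrated in Python. The helpers `list_streams` and `merge_streams` are hypothetical, tuples are modeled as dictionaries, and the sketch assumes each input is already sorted ascending on a single field.

```python
import heapq
from itertools import chain

def list_streams(*streams):
    """Iterate each stream in turn, concatenating their tuples (like `list`)."""
    return chain.from_iterable(streams)

def merge_streams(*streams, on):
    """Merge streams that are already sorted ascending on `on` (like `merge`)."""
    return heapq.merge(*streams, key=lambda tup: tup[on])

# Hypothetical streams, each sorted by id.
stream1 = [{"id": 1}, {"id": 4}]
stream2 = [{"id": 2}, {"id": 3}]

concatenated_ids = [t["id"] for t in list_streams(stream1, stream2)]
merged_ids = [t["id"] for t in merge_streams(stream1, stream2, on="id")]
```

Concatenation keeps each stream's tuples together, while the merge interleaves them so the overall ordering on `id` is maintained. This is why the underlying sorts must line up with the `on` parameter.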

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4399bcd0/solr/solr-ref-guide/src/vector-math.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/vector-math.adoc b/solr/solr-ref-guide/src/vector-math.adoc
index d610d4e..6171d77 100644
--- a/solr/solr-ref-guide/src/vector-math.adoc
+++ b/solr/solr-ref-guide/src/vector-math.adoc
@@ -145,7 +145,9 @@ When this expression is sent to the `/stream` handler it responds with:
 
 == Vector Sorting
 
-An array can be sorted in natural ascending order with `asc` function.
+An array can be sorted in natural ascending order with the `asc` function.
+
+The example below shows the `asc` function sorting an array:
 
 [source,text]
 ----