Posted to commits@lucene.apache.org by dw...@apache.org on 2021/03/10 09:59:35 UTC

[lucene] 22/50: Solr Ref Guide: update 7.1 statistical function docs

This is an automated email from the ASF dual-hosted git repository.

dweiss pushed a commit to branch branch_7_1
in repository https://gitbox.apache.org/repos/asf/lucene.git

commit 59d402cd11822171fe50c0ce20040b34a163edf1
Author: Joel Bernstein <jb...@apache.org>
AuthorDate: Tue Oct 17 21:25:13 2017 -0400

    Solr Ref Guide: update 7.1 statistical function docs
---
 .../src/statistical-programming.adoc               | 77 +++++++++++++++++++++-
 .../src/stream-evaluator-reference.adoc            | 57 +++++++++++++---
 2 files changed, 122 insertions(+), 12 deletions(-)
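
Illustrative note, not part of the commit: the patch below corrects the `residuals` description so that the predicted values are subtracted from the actual values. The minimal Python sketch below shows that direction of subtraction, assuming the "simple regression" is an ordinary least-squares line. The helpers `fit_line` and `residuals` are hypothetical names for this sketch and are not Solr APIs.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept (assumed model)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(model, predictors, actuals):
    """Apply the model to the predictors, then compute actual - predicted."""
    slope, intercept = model
    predictions = [slope * p + intercept for p in predictors]
    return [a - p for a, p in zip(actuals, predictions)]


# Fit on a perfectly linear data set, then score a slightly different one.
model = fit_line([1, 2, 3], [10, 20, 30])
res = residuals(model, [1, 2, 3], [12, 20, 27])
print(res)
```

With this convention a positive residual means the actual value was above the prediction, and a negative residual means it was below.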

diff --git a/solr/solr-ref-guide/src/statistical-programming.adoc b/solr/solr-ref-guide/src/statistical-programming.adoc
index 0d461be..07fba23 100644
--- a/solr/solr-ref-guide/src/statistical-programming.adoc
+++ b/solr/solr-ref-guide/src/statistical-programming.adoc
@@ -455,10 +455,81 @@ Returns the following response:
 
 == Setting Variables with let
 
-The `let` function sets variables and runs a streaming expression that references the variables. The `let` function can be used to
-write small statistical programs.
+The `let` function sets variables and returns the last variable. The output of any statistical function can be set to a variable.
 
-A variable can be set to the output of any streaming expression. Here is a very simple example:
+Below is a simple example setting three variables `a`, `b` and `correlation`.
+
+[source,text]
+----
+let(a=array(1,2,3),
+    b=array(10, 20, 30),
+    correlation=corr(a, b))
+----
+
+Here is the output:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "correlation": 1
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
+All variables can be output by setting the `echo` variable to `true`.
+
+[source,text]
+----
+let(echo=true,
+    a=array(1,2,3),
+    b=array(10, 20, 30),
+    correlation=corr(a, b))
+----
+
+Here is the output:
+
+[source,json]
+----
+{
+  "result-set": {
+    "docs": [
+      {
+        "a": [
+          1,
+          2,
+          3
+        ],
+        "b": [
+          10,
+          20,
+          30
+        ],
+        "correlation": 1
+      },
+      {
+        "EOF": true,
+        "RESPONSE_TIME": 0
+      }
+    ]
+  }
+}
+----
+
+Streaming expressions can also be used inside of a `let` expression in the following ways:
+
+* A variable can be set to the output of any streaming expression.
+* A streaming expression can be executed after all variables have been set. The variables can then be referenced by the streaming expression that is executed. The `let` expression will stream the tuples that are emitted by the final streaming expression.
+
+Here is a very simple example:
 
 [source,text]
 ----
diff --git a/solr/solr-ref-guide/src/stream-evaluator-reference.adoc b/solr/solr-ref-guide/src/stream-evaluator-reference.adoc
index 392144c..691ac2f 100644
--- a/solr/solr-ref-guide/src/stream-evaluator-reference.adoc
+++ b/solr/solr-ref-guide/src/stream-evaluator-reference.adoc
@@ -659,8 +659,14 @@ ebeSubtract(numericArray, numericArray)
 
 == empiricalDistribution
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 The `empiricalDistribution` function returns https://en.wikipedia.org/wiki/Empirical_distribution_function[empirical distribution function], a continuous probability distribution function based
 on an actual data set. This function is part of the probability distribution framework and is designed to work with the `<<sample>>`, `<<kolmogorovSmirnov>>` and `<<cumulativeProbability>>` functions.
+=======
+The `empiricalDistribution` function returns a continuous probability distribution function based
+on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is
+designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 This function is designed to work with continuous data. To build a distribution from
 a discrete data set use the `<<enumeratedDistribution>>`.
@@ -1049,7 +1055,11 @@ The supported distribution functions are: `<<empiricalDistribution>>`, `<<normal
 
 === kolmogorovSmirnov Returns
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 A result tuple: A tuple containing the p-value and d-statistic for the test result.
+=======
+result tuple : A tuple containing the p-value and d-statistic for the test result.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 === kolmogorovSmirnov Syntax
 
@@ -1158,8 +1168,13 @@ if(gt(fieldA,fieldB),mod(fieldA,fieldB),mod(fieldB,fieldA)) // if fieldA > field
 
 == monteCarlo
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 The `monteCarlo` function performs a https://en.wikipedia.org/wiki/Monte_Carlo_method[Monte Carlo simulation]
 based on its parameters. The `monteCarlo` function runs another function a specified number of times and returns the results.
+=======
+The `monteCarlo` function performs a Monte Carlo simulation (https://en.wikipedia.org/wiki/Monte_Carlo_method)
+based on its parameters. The monteCarlo function runs another function a specified number of times and returns the results.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 The function being run typically has one or more variables that are drawn from probability
 distributions on each run. The `<<sample>>` function is used in the function to draw the samples.
 
@@ -1325,9 +1340,15 @@ or(fieldA,fieldB,fieldC,and(fieldD,fieldE),fieldF)
 
 == poissonDistribution
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 The `poissonDistribution` function returns a https://en.wikipedia.org/wiki/Poisson_distribution[poisson probability distribution]
 based on its parameter. This function is part of the probability distribution framework and is designed to
 work with the `<<sample>>`, `<<probability>>` and `<<cumulativeProbability>>` functions.
+=======
+The `poissonDistribution` function returns a poisson probability distribution (https://en.wikipedia.org/wiki/Poisson_distribution)
+based on its parameter. This function is part of the probability distribution framework and is designed to
+work with the `sample`, `probability` and `cumulativeProbability` functions.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 === poissonDistribution Parameters
 
@@ -1348,9 +1369,15 @@ The `polyFit` function performs https://en.wikipedia.org/wiki/Curve_fitting#Fitt
 
 === polyFit Parameters
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 * `numeric array`: (Optional) x values. If omitted a sequence will be created for the x values.
 * `numeric array`: y values
 * `integer`: (Optional) polynomial degree. Defaults to 3.
+=======
+* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
+* `numeric array` : y values
+* `integer` : (Optional) polynomial degree. Defaults to 3.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 === polyFit Returns
 
@@ -1359,7 +1386,8 @@ A numeric array: curve that was fit to the data points.
 === polyFit Syntax
 
 [source,text]
-polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using a the default 3 degree polynomial.
+polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using the default 3 degree polynomial.
+polyFit(yValues, 5) // This creates the xValues automatically and fits a curve through the data points using a 5 degree polynomial.
 polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial.
 
 == polyfitDerivative
@@ -1368,9 +1396,15 @@ The `polyfitDerivative` function returns the derivative of the curve created by
 
 === polyfitDerivative Parameters
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 * `numeric array`: (Optional) x values. If omitted a sequence will be created for the x values.
 * `numeric array`: y values
 * `integer`: (Optional) polynomial degree. Defaults to 3.
+=======
+* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
+* `numeric array` : y values
+* `integer` : (Optional) polynomial degree. Defaults to 3.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 === polyfitDerivative Returns
 
@@ -1380,6 +1414,7 @@ A numeric array: The curve for the derivative created by the polynomial curve fi
 
 [source,text]
 polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative
+polyfitDerivative(yValues, 5) // This creates the xValues automatically, fits a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
 polyfitDerivative(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
 
 == pow
@@ -1439,13 +1474,17 @@ primes(100, 2000) // returns 100 primes starting from 2000
 
 == probability
 
-The `probability` function returns the probability of encountering a random variable within a discrete
-probability distribution.
+The `probability` function returns the probability of a random variable within a discrete probability distribution.
 
 === probability Parameters
 
+<<<<<<< 3196557fc49995bb3d083f25e13e09b3477a765c
 * `discrete probability distribution`: poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution
 * `integer`: Value of the random variable to compute the probability for.
+=======
+* `discrete probability distribution` : poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution
+* `integer` : Value of the random variable to compute the probability for.
+>>>>>>> Solr Ref Guide: update 7.1 statistical function docs
 
 === probability Returns
 
@@ -1454,7 +1493,7 @@ A double: the probability.
 === probability Syntax
 
 [source,text]
-probability(poissonDistribution(10), 7) // Returns the probability of encountering a random sample if 7 in a poisson distribution with a mean of 10.
+probability(poissonDistribution(10), 7) // Returns the probability of a random sample of 7 in a poisson distribution with a mean of 10.
 
 == rank
 
@@ -1493,7 +1532,7 @@ eq(raw(fieldA), fieldA) // true if the value of fieldA equals the string "fieldA
 
 == regress
 
-The `regress` function performs a simple regression on two numeric arrays.
+The `regress` function performs a simple regression of two numeric arrays.
 
 The result of this expression is also used by the `<<predict>>` and `<<residuals>>` functions.
 
@@ -1512,8 +1551,8 @@ regress(numericArray1, numericArray2)
 
 The `residuals` function takes three parameters: a simple regression model, an array of predictor values
 and an array of actual values. The residuals function applies the simple regression model to the
-array of predictor values and computes a predictions array. The actual values array is then
-subtracted from the predictions array to compute the residuals array.
+array of predictor values and computes a predictions array. The predicted values array is then
+subtracted from the actual values array to compute the residuals array.
 
 === residuals Parameters
 
@@ -1576,8 +1615,8 @@ Either a single numeric random sample, or a numeric array depending on the sampl
 === sample Syntax
 
 [source,text]
-sample(normalDistribution(50, 5)) // Return a single random sample from a normalDistribution with mean of 50 and standard deviation of 5.
-sample(poissonDistribution(5), 1000) // Return 1000 random samples from poissonDistribution with a mean of 5.
+sample(poissonDistribution(5)) // Returns a single random sample from a poissonDistribution with mean of 5.
+sample(poissonDistribution(5), 1000) // Returns 1000 random samples from poissonDistribution with a mean of 5.
 
 == scale