Posted to commits@lucene.apache.org by ct...@apache.org on 2021/01/12 22:44:58 UTC

[lucene-solr] branch jira/solr-13105-toMerge updated: More consistency fixes; change "x axis" to "x-axis", etc.

This is an automated email from the ASF dual-hosted git repository.

ctargett pushed a commit to branch jira/solr-13105-toMerge
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git


The following commit(s) were added to refs/heads/jira/solr-13105-toMerge by this push:
     new 251b611  More consistency fixes; change "x axis" to "x-axis", etc.
251b611 is described below

commit 251b611c9753a68bcd70ef750714441684ab6273
Author: Cassandra Targett <ct...@apache.org>
AuthorDate: Tue Jan 12 16:44:13 2021 -0600

    More consistency fixes; change "x axis" to "x-axis", etc.
---
 .../solr-ref-guide/src/computational-geometry.adoc |   4 +-
 solr/solr-ref-guide/src/curve-fitting.adoc         |  12 +-
 solr/solr-ref-guide/src/machine-learning.adoc      |   8 +-
 solr/solr-ref-guide/src/numerical-analysis.adoc    |  54 +++---
 .../src/probability-distributions.adoc             |  12 +-
 solr/solr-ref-guide/src/regression.adoc            |  67 ++++----
 solr/solr-ref-guide/src/scalar-math.adoc           |   2 +-
 solr/solr-ref-guide/src/search-sample.adoc         |  10 +-
 solr/solr-ref-guide/src/simulations.adoc           | 186 +++++++++------------
 solr/solr-ref-guide/src/statistics.adoc            |   4 +-
 solr/solr-ref-guide/src/time-series.adoc           |  72 ++++----
 11 files changed, 197 insertions(+), 234 deletions(-)

diff --git a/solr/solr-ref-guide/src/computational-geometry.adoc b/solr/solr-ref-guide/src/computational-geometry.adoc
index 83139f9..4853a99 100644
--- a/solr/solr-ref-guide/src/computational-geometry.adoc
+++ b/solr/solr-ref-guide/src/computational-geometry.adoc
@@ -42,8 +42,8 @@ Before visualizing the convex hull its often useful to visualize the 2D points a
 
 In this example the `random` function draws a sample of records from the nyc311 (complaints database) collection where
 the complaint description matches "rat sighting" and the zip code is 11238. The latitude and longitude fields
-are then vectorized and plotted as a scatter plot with longitude on *x* axis and latitude on the
-*y* axis.
+are then vectorized and plotted as a scatter plot with longitude on the x-axis and latitude on the
+y-axis.
 
 image::images/math-expressions/convex0.png[]
 
diff --git a/solr/solr-ref-guide/src/curve-fitting.adoc b/solr/solr-ref-guide/src/curve-fitting.adoc
index 31009a0..b5613b0 100644
--- a/solr/solr-ref-guide/src/curve-fitting.adoc
+++ b/solr/solr-ref-guide/src/curve-fitting.adoc
@@ -65,7 +65,7 @@ The residuals can be calculated and visualized in the same manner as linear
 regression as well. In the example below the `ebeSubtract` function is used
 to subtract the fitted model from the observed values, to
 calculate a vector of residuals. The residuals are then plotted in a *residual plot*
-with the predictions along the *x* axis and the model error on the *y* axis.
+with the predictions along the x-axis and the model error on the y-axis.
 
 image::images/math-expressions/polyfit-resid.png[]
 
@@ -73,8 +73,8 @@ image::images/math-expressions/polyfit-resid.png[]
 == Gaussian Curve Fitting
 
 The `gaussfit` function fits a smooth curve through a Gaussian peak. The `gaussfit`
-function takes an *x* and *y* axis and fits a smooth gaussian curve to the data. If
-only one vector of numbers is passed, `gaussfit` will treat it as the *y* axis
+function takes an x-axis and y-axis and fits a smooth gaussian curve to the data. If
+only one vector of numbers is passed, `gaussfit` will treat it as the y-axis
 and will generate a sequence for the *x* access.
 
 One of the interesting use cases for `gaussfit` is to visualize how well a regression
@@ -93,8 +93,8 @@ observations in the
 bin. If the residuals are normally distributed we would expect the bin counts
 to roughly follow a gaussian curve.
 
-The bin count vector is then passed to `gaussfit` as the *y* axis. `gaussfit` generates
-a sequence for the *x* axis and then fits the gaussian curve to data.
+The bin count vector is then passed to `gaussfit` as the y-axis. `gaussfit` generates
+a sequence for the x-axis and then fits the gaussian curve to the data.
 
 `zplot` is then used to plot the original bin counts and the fitted curve. In the
 example below, the blue line is the bin counts, and the smooth yellow line is the
@@ -130,7 +130,7 @@ image::images/math-expressions/harmfit.png[]
 
 
 The output of `harmfit` is a model that can be used by the `predict` function to interpolate and extrapolate
-the sine wave. In the example below the `natural` function creates an *x* axis from 0 to 127
+the sine wave. In the example below the `natural` function creates an x-axis from 0 to 127
 used to predict results for the model. This extrapolates the sine wave out to 128 points, when
 the original model curve had only 19 control points.
 
diff --git a/solr/solr-ref-guide/src/machine-learning.adoc b/solr/solr-ref-guide/src/machine-learning.adoc
index 58801f6..876491e 100644
--- a/solr/solr-ref-guide/src/machine-learning.adoc
+++ b/solr/solr-ref-guide/src/machine-learning.adoc
@@ -268,8 +268,8 @@ for the visualization of the *residual plot*.
 
 image::images/math-expressions/redwine1.png[]
 
-The residual plot plots the *predicted* values on the *x* axis and the *error* for the
-prediction on the *y* axis. The scatter plot shows how the errors
+The residual plot plots the *predicted* values on the x-axis and the *error* for the
+prediction on the y-axis. The scatter plot shows how the errors
 are distributed across the full range of predictions.
 
 The residual plot can be interpreted to understand how the KNN regression performed on the
@@ -441,8 +441,8 @@ is used to visualize the clusters as a scatter plot.
 image::images/math-expressions/2DCluster1.png[]
 
 The scatter plot above shows each lat/lon point plotted on a Euclidean plain with longitude on the
-*x* axis and
-latitude on the *y* axis. The plot is dense enough so the outlines of the different boroughs are visible
+x-axis and
+latitude on the y-axis. The plot is dense enough so the outlines of the different boroughs are visible
 if you know the boroughs of New York City.
 
 
diff --git a/solr/solr-ref-guide/src/numerical-analysis.adoc b/solr/solr-ref-guide/src/numerical-analysis.adoc
index 16ec68e..9b1195e 100644
--- a/solr/solr-ref-guide/src/numerical-analysis.adoc
+++ b/solr/solr-ref-guide/src/numerical-analysis.adoc
@@ -47,21 +47,21 @@ control points. Then a spline is used to interpolate the smoothed control points
 === Sampling Along the Curve
 
 One way to better understand interpolation is to visualize what it means to sample along a curve. The example
-below zooms in on a specific region of a curve by sampling the curve between a specific *x* axis range.
+below zooms in on a specific region of a curve by sampling the curve within a specific x-axis range.
 
 image::images/math-expressions/interpolate1.png[]
 
-The visualization above first creates two arrays with *x* and *y* axis points. Notice that the *x* axis ranges from
+The visualization above first creates two arrays with x-axis and y-axis points. Notice that the x-axis ranges from
  0 to 9. Then the `akima`, `spline` and `lerp`
 functions are applied to the vectors to create three interpolation functions.
 
 Then 500 hundred random samples are drawn from a uniform distribution between 0 and 3. These are
-the new zoomed in *x* axis points, between 0 and 3. Notice that we are sampling a specific
+the new zoomed-in x-axis points, between 0 and 3. Notice that we are sampling a specific
 area of the curve.
 
-Then the `predict` function is used to predict *y* axis points for
-the sampled *x* axis, for all three interpolation functions. Finally all three prediction vectors
-are plotted with the sampled *x* axis points.
+Then the `predict` function is used to predict y-axis points for
+the sampled x-axis, for all three interpolation functions. Finally all three prediction vectors
+are plotted with the sampled x-axis points.
 
 The red line is the `lerp` interpolation, the blue line is the `akima` and the purple line is
 the `spline` interpolation. You can see they each produce different curves in between the control
@@ -78,21 +78,21 @@ A technique known as local regression is used to compute the smoothed curve. The
 neighborhood of the local regression can be adjusted
 to control how close the new curve conforms to the original control points.
 
-The `loess` function is passed *`x`*- and *`y`*-axes and fits a smooth curve to the data.
-If only a single array is provided it is treated as the *`y`*-axis and a sequence is generated
-for the *`x`*-axis.
+The `loess` function is passed x- and y-axes and fits a smooth curve to the data.
+If only a single array is provided it is treated as the y-axis and a sequence is generated
+for the x-axis.
 
 The example below shows the `loess` function being used to model a monthly
 time series. In the example the `timeseries` function is used to generate
 a monthly time series of average closing prices for the stock ticker
-*amzn*. The *date_dt* and *avg(close_d)* fields from the time series
-are then vectorized and stored in variables *x* and *y*. The `loess`
+*AMZN*. The `date_dt` and `avg(close_d)` fields from the time series
+are then vectorized and stored in variables `x` and `y`. The `loess`
 function is then applied to the *y* vector containing the average closing
-prices. The *bandwidth* named parameter specifies the percentage
+prices. The `bandwidth` named parameter specifies the percentage
 of the data set used to compute the local regression. The `loess` function
 returns the fitted model of smoothed data points.
 
-The `zplot` function is then used to plot the *x*, *y* and *y1*
+The `zplot` function is then used to plot the `x`, `y` and `y1`
 variables.
 
 image::images/math-expressions/loess.png[]
@@ -100,8 +100,8 @@ image::images/math-expressions/loess.png[]
 
 == Derivatives
 
-The derivative of a function measures the rate of change of the *y* value in respects to the
-rate of change of the *x* value.
+The derivative of a function measures the rate of change of the `y` value with respect to the
+rate of change of the `x` value.
 
 The `derivative` function can compute the derivative for any of the
 interpolation functions described above. Each interpolation function
@@ -115,11 +115,11 @@ the rate of change or *velocity*.
 
 In the example two vectors are created, one representing hours and
 one representing miles traveled. The `lerp` function is then used to
-create a linear interpolation of the *hours* and *miles* vectors.
+create a linear interpolation of the `hours` and `miles` vectors.
 The `derivative` function is then applied to the
-linear interpolation. `zplot` is then used to plot the *hours*
-on the *x* axis and *miles* on the *y* axis, and the *derivative* as *mph*,
-at each *x* axis point.
+linear interpolation. `zplot` is then used to plot the `hours`
+on the x-axis and `miles` on the y-axis, and the `derivative` as `mph`,
+at each x-axis point.
 
 
 image::images/math-expressions/derivative.png[]
@@ -140,11 +140,11 @@ a derivative with rounded curves.
 === The Second Derivative (Acceleration)
 
 While the first derivative represents velocity, the second derivative
-represents *acceleration*. The second the derivative is the derivative
+represents *acceleration*. The second derivative is the derivative
 of the first derivative.
 
 The example below builds on the first example and adds the second derivative.
-Notice that the second derivative *d2*, is taken by applying the
+Notice that the second derivative `d2` is taken by applying the
 derivative function to a linear interpolation of the first derivative.
 
 The second derivative is plotted as *acceleration* on the chart.
@@ -160,7 +160,7 @@ line drops to 0.
 The example below shows how to plot the `derivative` for a time series generated
 by the `timeseries` function. In the example a monthly time series is
 generated for the average closing price for the stock ticker `amzn`.
-The *avg(close)* column is vectorized and interpolated using linear
+The `avg(close)` column is vectorized and interpolated using linear
 interpolation (`lerp`).  The `zplot` function is then used to plot the derivative
 of the time series.
 
@@ -217,8 +217,8 @@ responds with:
 
 If the `integral` function is passed a single interpolated curve it returns a vector of the cumulative
 integrals for the curve. The cumulative integrals vector contains a cumulative integral calculation
-for each *x* axis point. The cumulative integral is calculated by taking the
-integral of the range between each *x* axis point and the *first* *x* axis point. In the example above this would
+for each x-axis point. The cumulative integral is calculated by taking the
+integral of the range between each x-axis point and the *first* x-axis point. In the example above this would
 mean calculating a vector of integrals as such:
 
 [source,text]
@@ -230,19 +230,19 @@ let(x=array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
 ----
 
 The plot of cumulative integrals visualizes how much cumulative volume of the curve is under each point
-*x* axis point.
+on the x-axis.
 
 The example below shows the cumulative integral plot for a time series generated by
 the `timeseries` function. In the example a monthly time series is
 generated for the average closing price for the stock ticker `amzn`.
-The *avg(close)* column is vectorized and interpolated using a `spline`.
+The `avg(close)` column is vectorized and interpolated using a `spline`.
 
 The `zplot` function is then used to plot the cumulative integral
 of the time series.
 
 image::images/math-expressions/integral.png[]
 
-The plot above *visualizes* the volume under the curve as the *amzn* stock
+The plot above visualizes the volume under the curve as the *AMZN* stock
 price changes over time.  Because this plot is cumulative, the volume under
 a stock price time series which stays the same over time, will
 have a positive *linear* slope. A stock that has rising prices will have a *concave* shape and
diff --git a/solr/solr-ref-guide/src/probability-distributions.adoc b/solr/solr-ref-guide/src/probability-distributions.adoc
index 472d043..85e4247 100644
--- a/solr/solr-ref-guide/src/probability-distributions.adoc
+++ b/solr/solr-ref-guide/src/probability-distributions.adoc
@@ -37,17 +37,15 @@ are the supported continuous probability distributions.
 The `empiricalDistribution` function creates a continuous probability
 distribution from actual data.
 
-Empirical distributions can be used to conveniently visualize the probability density
-function of a random sample from a SolrCloud
-collection. The example below shows the zplot function visualizing the probability
+Empirical distributions can be used to conveniently visualize the probability density function of a random sample from a SolrCloud collection.
+The example below shows the `zplot` function visualizing the probability
 density of a random sample with a 32 bin histogram.
 
 image::images/math-expressions/empirical.png[]
 
 ==== normalDistribution
 
-The visualization below shows a normal distribution with a `mean` of 0 and `standard
-deviation` of 1.
+The visualization below shows a normal distribution with a `mean` of 0 and `standard deviation` of 1.
 
 image::images/math-expressions/dist.png[]
 
@@ -361,7 +359,7 @@ In this example 5000 random samples are selected from a collection of log record
 the fields `filesize_d` and `response_d`. The values of both fields conform to a normal distribution.
 
 Both fields are then vectorized. The `filesize_d` vector is stored in
-variable *`b`* and the `response_d` variable is stored in variable *`c`*.
+variable `b` and the `response_d` variable is stored in variable `c`.
 
 An array is created that contains the means of the two vectorized fields.
 
@@ -373,7 +371,7 @@ the observation matrix with the `cov` function. The covariance matrix describes
 
 The `multivariateNormalDistribution` function is then called with the
 array of means for the two fields and the covariance matrix. The model for the
-multivariate normal distribution is assigned to variable *`g`*.
+multivariate normal distribution is assigned to variable `g`.
 
 Finally five samples are drawn from the multivariate normal distribution.
 
diff --git a/solr/solr-ref-guide/src/regression.adoc b/solr/solr-ref-guide/src/regression.adoc
index fb29e18..4e3dcad 100644
--- a/solr/solr-ref-guide/src/regression.adoc
+++ b/solr/solr-ref-guide/src/regression.adoc
@@ -27,7 +27,7 @@ the second array is the dependent variable.
 
 In the example below the `random` function selects 50000 random samples each containing
 the fields `filesize_d` and `response_d`. The two fields are vectorized
-and stored in variables *`x`* and *`y`*. Then the `regress` function performs a regression
+and stored in variables `x` and `y`. Then the `regress` function performs a regression
 analysis on the two numeric arrays.
 
 The `regress` function returns a single tuple with the results of the regression
@@ -131,7 +131,7 @@ let(a=random(logs, q="*:*", rows="5000", fl="filesize_d, response_d"),
 
 When this expression is sent to the `/stream` handler it responds with:
 
-[source,text]
+[source,json]
 ----
 {
   "result-set": {
@@ -148,8 +148,7 @@ When this expression is sent to the `/stream` handler it responds with:
           699.5597256337142,
           742.4738911248204,
           769.0342605881644,
-          746.6740473150268,
-          ...
+          746.6740473150268
           ]
       },
       {
@@ -163,9 +162,10 @@ When this expression is sent to the `/stream` handler it responds with:
 
 === Regression Plot
 
-Using *zplot* and the Zeppelin-Solr interpreter we can visualize both the observations and the predictions in
-the same scatter plot. In the example below zplot is plotting the *filesize_d* observations on the
-*x* axis, the *response_d* observations on the *y* access and the predictions on the *y1* access.
+Using `zplot` and the Zeppelin-Solr interpreter we can visualize both the observations and the predictions in
+the same scatter plot.
+In the example below `zplot` is plotting the `filesize_d` observations on the
+x-axis, the `response_d` observations on the y-axis, and the predictions as the `y1` series.
 
 image::images/math-expressions/linear.png[]
 
@@ -175,9 +175,9 @@ The difference between the observed value and the predicted value is known as th
 residual. There isn't a specific function to calculate the residuals but vector
 math can used to perform the calculation.
 
-In the example below the predictions are stored in variable *`p`*. The `ebeSubtract`
+In the example below the predictions are stored in variable `p`. The `ebeSubtract`
 function is then used to subtract the predictions
-from the actual `response_d` values stored in variable *`y`*. Variable *`e`* contains
+from the actual `response_d` values stored in variable `y`. Variable `e` contains
 the array of residuals.
 
 [source,text]
@@ -192,7 +192,7 @@ let(a=random(logs, q="*:*", rows="500", fl="filesize_d, response_d"),
 
 When this expression is sent to the `/stream` handler it responds with:
 
-[source,text]
+[source,json]
 ----
 {
   "result-set": {
@@ -211,8 +211,7 @@ When this expression is sent to the `/stream` handler it responds with:
           -30.213178859683012,
           -30.609943619066826,
           10.527700442607625,
-          10.68046928406568,
-          ...
+          10.68046928406568
           ]
       },
       {
@@ -226,28 +225,24 @@ When this expression is sent to the `/stream` handler it responds with:
 
 === Residual Plot
 
-Using *zplot* and Zeppelin-Solr we can visualize the residuals with
+Using `zplot` and Zeppelin-Solr we can visualize the residuals with
 a residuals plot. The example residual plot below plots the predicted value on the
-*x* axis and the error of the prediction on the *y* access.
+x-axis and the error of the prediction on the y-axis.
 
 image::images/math-expressions/residual-plot.png[]
 
-The residual plot can be used to interpret reliability of the model. Three things
-to look for are:
+The residual plot can be used to interpret reliability of the model. Three things to look for are:
 
-1) Do the residuals appear to be normally distributed with a mean of 0. This makes
-it easier to interpret the results of the model to determine if the distribution
-of the errors is acceptable for predictions. It also makes it easier to use a model
-of the residuals for anomaly detection on new predictions.
+. Do the residuals appear to be normally distributed with a mean of 0?
+This makes it easier to interpret the results of the model to determine if the distribution of the errors is acceptable for predictions.
+It also makes it easier to use a model of the residuals for anomaly detection on new predictions.
 
-2) Do the residuals appear to be *heteroscedastic*. Which means is the variance
-of the residuals the same across the range of predictions. By plotting the prediction
-on the *x* axis and error on *y* access we can see if the variability stays the same
-as the predictions get higher. If the residuals are heteroscedastic it means
-that we can trust the models error to be consistent across the range of predictions.
+. Do the residuals appear to be *homoscedastic*?
+That is, is the variance of the residuals the same across the range of predictions?
+By plotting the prediction on the x-axis and error on the y-axis we can see if the variability stays the same as the predictions get higher.
+If the residuals are homoscedastic it means that we can trust the model's error to be consistent across the range of predictions.
 
-3) Is there any pattern to the residuals? If so there is likely still a signal within the
-data that needs to be modeled.
+. Is there any pattern to the residuals? If so there is likely still a signal within the data that needs to be modeled.
 
 
 == Multivariate Linear Regression
@@ -259,7 +254,7 @@ The example below extends the simple linear regression example by introducing a
 called `load_d`. The `load_d` variable is the load on the network while the file is being downloaded.
 
 Notice that the two independent variables `filesize_d` and `load_d` are vectorized and stored
-in the variables *`b`* and *`c`*. The variables *`b`* and *`c`* are then added as rows to a `matrix`. The matrix is
+in the variables `b` and `c`. The variables `b` and `c` are then added as rows to a `matrix`. The matrix is
 then transposed so that each row in the matrix represents one observation with `filesize_d` and `service_d`.
 The `olsRegress` function then performs the multivariate regression analysis using the observation matrix as the
 independent variables and the `response_d` values, stored in variable *`d`*, as the dependent variable.
@@ -274,7 +269,7 @@ let(a=random(testapp, q="*:*", rows="30000", fl="filesize_d, load_d, response_d"
     r=olsRegress(m, z))
 ----
 
-Notice in the response that the RSquared of the regression analysis is 1. This means that linear relationship between
+Notice in the response that the `RSquared` of the regression analysis is `1`. This means that the linear relationship between
 `filesize_d` and `service_d` describe 100% of the variability of the `response_d` variable:
 
 [source,json]
@@ -381,7 +376,7 @@ let(a=random(logs, q="*:*", rows="5000", fl="filesize_d, load_d, response_d"),
 
 When this expression is sent to the `/stream` handler it responds with:
 
-[source,text]
+[source,json]
 ----
 {
   "result-set": {
@@ -399,8 +394,7 @@ When this expression is sent to the `/stream` handler it responds with:
           841.5253327135974,
           896.9648275225625,
           858.6511235977382,
-          869.8381475112501,
-          ...
+          869.8381475112501
           ]
       },
       {
@@ -418,7 +412,7 @@ Once the predictions are generated the residuals can be calculated using the sam
 simple linear regression.
 
 Below is an example of the residuals calculation following a multivariate linear regression. In the example
-the predictions stored variable *`g`* are subtracted from observed values stored in variable *`d`*.
+the predictions stored in variable `g` are subtracted from observed values stored in variable `d`.
 
 [source,text]
 ----
@@ -434,7 +428,7 @@ let(a=random(logs, q="*:*", rows="5000", fl="filesize_d, load_d, response_d"),
 
 When this expression is sent to the `/stream` handler it responds with:
 
-[source,text]
+[source,json]
 ----
 {
   "result-set": {
@@ -451,8 +445,7 @@ When this expression is sent to the `/stream` handler it responds with:
           -4.276176642246014,
           10.781062392156628,
           0.00039750380267378205,
-          -1.8307638852961645,
-          ...
+          -1.8307638852961645
           ]
       },
       {
@@ -467,7 +460,7 @@ When this expression is sent to the `/stream` handler it responds with:
 === Residual Plot
 
 The residual plot for multi-variate linear regression is the same as for simple linear regression.
-The predictions are plotted on the *x* axis and the error is plotted on the *y* axis.
+The predictions are plotted on the x-axis and the error is plotted on the y-axis.
 
 image::images/math-expressions/residual-plot2.png[]
 
diff --git a/solr/solr-ref-guide/src/scalar-math.adoc b/solr/solr-ref-guide/src/scalar-math.adoc
index 66f38ca..4db6a38 100644
--- a/solr/solr-ref-guide/src/scalar-math.adoc
+++ b/solr/solr-ref-guide/src/scalar-math.adoc
@@ -143,7 +143,7 @@ The expression above can be visualized as a table using Zeppelin-Solr.
 
 image::images/math-expressions/stream.png[]
 
-By switching to one of the line chart visualizations the two variables can be plotted on the x and y axis.
+By switching to one of the line chart visualizations the two variables can be plotted on the x-axis and y-axis.
 
 image::images/math-expressions/line.png[]
 
diff --git a/solr/solr-ref-guide/src/search-sample.adoc b/solr/solr-ref-guide/src/search-sample.adoc
index 8bf5a1e..2fd97bd 100644
--- a/solr/solr-ref-guide/src/search-sample.adoc
+++ b/solr/solr-ref-guide/src/search-sample.adoc
@@ -50,7 +50,7 @@ We have also limited the result set to three specific fields.
 image::images/math-expressions/search-sort.png[]
 
 Once the data is loaded into the table we can switch to a scatter plot and plot the `filesize_d` column
-on the *x axis* and the `response_d` column on the *y axis*.
+on the x-axis and the `response_d` column on the y-axis.
 
 image::images/math-expressions/search-sort-plot.png[]
 
@@ -84,11 +84,11 @@ and <<machine-learning.adoc,Machine Learning>> sections.
 In the example below the `random` function is called in its simplest form with just a collection name as the parameter.
 
 When called with no other parameters the `random` function returns a random sample of 500 records with all fields from the collection.
-When called without the field list parameter (`fl`) the `random` function also generates a sequence, 0-499 in this case, which can be used for plotting the `x` axis.
+When called without the field list parameter (`fl`) the `random` function also generates a sequence, 0-499 in this case, which can be used for plotting the x-axis.
 This sequence is returned in a field called `x`.
 
 The visualization below shows a scatter plot with the `filesize_d` field
-plotted on the `y` axis and the `x` sequence plotted on the `x` axis.
+plotted on the y-axis and the `x` sequence plotted on the x-axis.
 The effect of this is to spread the `filesize_d` samples across the length
 of the plot so they can be more easily studied.
 
@@ -114,7 +114,7 @@ The field list (`fl`) now specifies two fields to be
 returned with each sample: `filesize_d` and `response_d`.
 The `q` and `rows` parameters are the same as the defaults but are included as an example of how to set these parameters.
 
-By plotting `filesize_d` on the *x* axis and `response_d` on the *y* axis we can begin to study the relationship between the two variables.
+By plotting `filesize_d` on the x-axis and `response_d` on the y-axis we can begin to study the relationship between the two variables.
 
 By studying the scatter plot we can learn the following:
 
@@ -246,7 +246,7 @@ text field containing movie reviews. The result shows the
 significant terms that appear in movie reviews that have the phrase "sci-fi".
 
 The results are visualized using a bubble chart with the *foreground* count on
-plotted on the *x* axis and the *background* count on the *y* axis. Each term is
+the x-axis and the *background* count on the y-axis. Each term is
 shown in a bubble sized by the *score*.
 
 image::images/math-expressions/sterms.png[]
diff --git a/solr/solr-ref-guide/src/simulations.adoc b/solr/solr-ref-guide/src/simulations.adoc
index 7d14f3d..349aac6 100644
--- a/solr/solr-ref-guide/src/simulations.adoc
+++ b/solr/solr-ref-guide/src/simulations.adoc
@@ -31,7 +31,7 @@ A useful first step in understanding the difference is to visualize
 daily stock returns, calculated as closing price minus opening price, as a time series.
 
 The example below uses the `search` function to return 1000 days of daily stock
-returns for the ticker *cvx* (Chevron). The *change_d* field, which is the
+returns for the ticker *CVX* (Chevron). The `change_d` field, which is the
 change in price for the day, is then plotted as a time series.
 
 image::images/math-expressions/randomwalk1.png[]
@@ -48,104 +48,90 @@ Autocorrelation measures the degree to which a signal is correlated with itself.
 if a vector contains a signal or if there is dependency between values in a time series. If there is no
 signal and no dependency between values in the time series then the time series is random.
 
-It's useful to plot the autocorrelation of the *change_d* vector to confirm that it is indeed random.
+It's useful to plot the autocorrelation of the `change_d` vector to confirm that it is indeed random.
 
-In the example below the search results are set to a variable and then the *change_d* field
-is vectorized and stored in variable *b*. Then the
- `conv` (convolution) function is used to autocorrelate
-the *change_d* vector.
-Notice that the `conv` function is simply "convolving" the *change_d* vector
+In the example below the search results are set to a variable and then the `change_d` field is vectorized and stored in variable `b`.
+Then the `conv` (convolution) function is used to autocorrelate
+the `change_d` vector.
+Notice that the `conv` function is simply "convolving" the `change_d` vector
 with a reversed copy of itself.
 This is the technique for performing autocorrelation using convolution.
 The <<dsp.adoc#dsp,Signal Processing>> section
 of the user guide covers both convolution and autocorrelation in detail.
 In this section we'll just discuss the plot.
 
-The plot shows the intensity of correlation that is calculated as the *change_d* vector is slid across
-itself by the `conv` function.
-Notice in the plot there is long period of low intensity correlation that appears
-to be random. Then in the center a peak of high intensity correlation where the vectors
+The plot shows the intensity of correlation that is calculated as the `change_d` vector is slid across itself by the `conv` function.
+Notice in the plot there is a long period of low intensity correlation that appears to be random.
+Then in the center a peak of high intensity correlation where the vectors
 are directly lined up.
 This is followed by another long period of low intensity correlation.
 
-This is the autocorrelation plot of pure noise. The daily stock changes appear
-to be a random time series.
+This is the autocorrelation plot of pure noise.
+The daily stock changes appear to be a random time series.
 
 image::images/math-expressions/randomwalk2.png[]
 
 === Visualizing the Distribution
 
 The random daily changes in stock prices cannot be predicted, but they can be modeled with a probability distribution.
-To model the time series we'll start by visualizing the distribution of the *change_d* vector. In the example
-below the *change_d* vector is plotted using the `empiricalDistribution` function to create an 11 bin
-histogram of the data. Notice that the distribution appears to be normally distributed. Daily stock price
-changes do tend to be normally distributed although *cvx* was chosen specifically
-for this example because of this characteristic.
+To model the time series we'll start by visualizing the distribution of the `change_d` vector.
+In the example below the `change_d` vector is plotted using the `empiricalDistribution` function to create an 11-bin
+histogram of the data.
+Notice that the distribution appears to be normally distributed.
+Daily stock price changes do tend to be normally distributed, although *CVX* was chosen specifically for this example because of this characteristic.
 
 image::images/math-expressions/randomwalk3.png[]
 
 
 === Fitting the Distribution
 
-The `ks` Test can be used to determine if the distribution of a vector of data fits a
-reference distribution.
-In the example below the `ks` test is performed with a *normal distribution* with the *mean*
-and *standard deviation* of the *change_d* vector as the reference distribution. The `ks` test is
-checking the reference distribution against the *change_d* vector itself to see if it
-fits a normal distribution.
+The `ks` test can be used to determine if the distribution of a vector of data fits a reference distribution.
+In the example below the `ks` test is performed with a *normal distribution* with the *mean* (`mean`) and *standard deviation* (`stddev`) of the `change_d` vector as the reference distribution.
+The `ks` test is checking the reference distribution against the `change_d` vector itself to see if it fits a normal distribution.
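A sketch of such a `ks` test expression, assuming the same `stocks` collection and `change_d` field used in the earlier examples:

[source,text]
----
let(a=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    b=col(a, change_d),
    d=normalDistribution(mean(b), stddev(b)),
    test=ks(d, b))
----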
 
-Notice in the example below the `ks` test reports a p-value of .16278. A p-value of .05 or less is typically
-used to invalidate the null hypothesis of the test which is that the vector could have been
-drawn from the reference distribution.
+Notice in the example below the `ks` test reports a p-value of .16278.
+A p-value of .05 or less is typically used to invalidate the null hypothesis of the test, which is that the vector could have been drawn from the reference distribution.
 
 image::images/math-expressions/randomwalk4.png[]
 
 
-The `ks` test, which tends to be fairly sensitive, has confirmed the visualization which appeared to be normal. Because of this the
-normal distribution with the *mean* and *standard deviation* of the *change_d* vector will be used to represent the daily stock returns
-for Chevron in the Monte Carlo simulations below.
+The `ks` test, which tends to be fairly sensitive, has confirmed the visualization which appeared to be normal.
+Because of this the normal distribution with the *mean* and *standard deviation* of the `change_d` vector will be used to represent the daily stock returns for Chevron in the Monte Carlo simulations below.
 
 === Monte Carlo
 
-Now that we have fit a distribution to the daily stock return data we can use the
-`monteCarlo` function to run a simulation using the distribution.
+Now that we have fit a distribution to the daily stock return data we can use the `monteCarlo` function to run a simulation using the distribution.
 
-The `monteCarlo` function runs a specified number of times. On each run it sets
-a series of variables and runs one final function which returns a single numeric value. The
-monteCarlo function collects the results of each run in a vector and returns it.
-The final function typically has one or more variables that are drawn from probability
-distributions on each run. The `sample` function is used to draw the samples.
+The `monteCarlo` function runs a specified number of times.
+On each run it sets a series of variables and runs one final function which returns a single numeric value.
+The `monteCarlo` function collects the results of each run in a vector and returns it.
+The final function typically has one or more variables that are drawn from probability distributions on each run.
+The `sample` function is used to draw the samples.
 
-The simulation's result array can then be treated as an empirical distribution to understand
-the probabilities of the simulation results.
+The simulation's result array can then be treated as an empirical distribution to understand the probabilities of the simulation results.
 
-The example below uses the `monteCarlo` function to simulate a distribution for the total return
-of 100 days of stock returns.
+The example below uses the `monteCarlo` function to simulate a distribution for the total return of 100 days of stock returns.
 
-In the example a `normalDistribution` is created from the *mean* and *standard deviation*
-of the *change_d* vector. The `monteCarlo` function then draws 100 samples from the
-normal distribution to represent 100 days of stock returns and sets
-the vector of samples to the variable *d*.
+In the example a `normalDistribution` is created from the *mean* and *standard deviation* of the `change_d` vector.
+The `monteCarlo` function then draws 100 samples from the normal distribution to represent 100 days of stock returns and sets the vector of samples to the variable `d`.
 
 The `add` function then calculates the total return
-from the 100 day sample. The output of the `add` function is collected by the
-`monteCarlo` function. This is repeated
-50000 times, with each run drawing a different set of samples from
-the normal distribution.
+from the 100-day sample.
+The output of the `add` function is collected by the `monteCarlo` function.
+This is repeated 50000 times, with each run drawing a different set of samples from the normal distribution.
 
-The result of the simulation is set to variable *s*, which contains
+The result of the simulation is set to variable `s`, which contains
 the total returns from the 50000 runs.
 
-The `empiricalDistribution` function is then used to visualize the output of the simulation
-as a 50 bin histogram. The distribution visualizes the probability of the different total
-returns from 100 days of stock returns for ticker *cvx*.
+The `empiricalDistribution` function is then used to visualize the output of the simulation as a 50-bin histogram.
+The distribution visualizes the probability of the different total
+returns from 100 days of stock returns for ticker *CVX*.
 
 image::images/math-expressions/randomwalk5.png[]
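The simulation described above might be expressed along these lines (collection and field names are assumed from the earlier examples, and are illustrative):

[source,text]
----
let(a=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    change=col(a, change_d),
    d=normalDistribution(mean(change), stddev(change)),
    s=monteCarlo(samples=sample(d, 100),
                 add(samples),
                 50000),
    e=empiricalDistribution(s, 50))
----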
 
 The `probability` and `cumulativeProbability` functions can then be used to
 learn more about the `empiricalDistribution`.
-For example the `probability` function can be used to
-calculate the probability of a non-negative return from 100 days of stock returns.
+For example the `probability` function can be used to calculate the probability of a non-negative return from 100 days of stock returns.
 
 The example below uses the `probability` function to return the probability of a
 return between the range of 0 and 40 from the `empiricalDistribution`
@@ -157,15 +143,13 @@ image::images/math-expressions/randomwalk5.1.png[]
 
 The `monteCarlo` function can also be used to model a random walk of
 daily stock prices from the `normalDistribution` of daily stock returns.
-A random walk is a time series where each step is calculated by adding a random sample to the previous
-step. This creates a time series where each value is dependent on the previous value,
-which simulates the autocorrelation of stock prices.
+A random walk is a time series where each step is calculated by adding a random sample to the previous step.
+This creates a time series where each value is dependent on the previous value, which simulates the autocorrelation of stock prices.
 
-In the example below the random walk is achieved by adding a random sample to the
-variable *v* on each Monte Carlo iteration. The variable `v` is maintained between
-iterations so each iteration uses the previous value of `v`. The `double` function
-is the final function run each iteration, which simply returns the value of `v` as a
-double. The example iterates 1000 times to create a random walk with 1000 steps.
+In the example below the random walk is achieved by adding a random sample to the variable `v` on each Monte Carlo iteration.
+The variable `v` is maintained between iterations so each iteration uses the previous value of `v`.
+The `double` function is the final function run each iteration, which simply returns the value of `v` as a double.
+The example iterates 1000 times to create a random walk with 1000 steps.
 
 image::images/math-expressions/randomwalk6.png[]
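A sketch of the random walk simulation; initializing `v` to 0 before the iterations begin is an assumption made here for illustration:

[source,text]
----
let(a=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    change=col(a, change_d),
    d=normalDistribution(mean(change), stddev(change)),
    v=0,
    walk=monteCarlo(v=add(v, sample(d)),
                    double(v),
                    1000))
----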
 
@@ -176,32 +160,28 @@ random daily change in stock price.
 == Multivariate Normal Distribution
 
 The `multiVariateNormalDistribution` function can be used to model and simulate
-two or more normally distributed variables. It also incorporates the
-*correlation* between variables into the model which allows for the study of
-how correlation effects the possible outcomes.
+two or more normally distributed variables.
+It also incorporates the *correlation* between variables into the model, which allows for the study of how correlation affects the possible outcomes.
 
 In the examples below a simulation of the total daily returns of two
-stocks is explored. The *all* ticker (*Allstate*) is used along with the
-*cvx* ticker (*Chevron*) from the previous examples.
+stocks is explored.
+The *ALL* ticker (*Allstate*) is used along with the *CVX* ticker (*Chevron*) from the previous examples.
 
 === Correlation and Covariance
 
 The multivariate simulations show the effect of correlation on possible
-outcomes. Before getting started with actual simulations its useful
-to first understand the correlation and covariance between
-the Allstate and Chevron stock returns.
+outcomes.
+Before getting started with actual simulations it's useful to first understand the correlation and covariance between the Allstate and Chevron stock returns.
 
 The example below runs two searches to retrieve the daily stock returns
-for all Allstate and Chevron. The *change_d* vectors from both returns
-are read into variables (*all* and *cvx*) and Pearson's correlation is
-calculated for the two vectors with the `corr` function.
+for both Allstate and Chevron.
+The `change_d` vectors from both returns are read into variables (`all` and `cvx`) and Pearson's correlation is calculated for the two vectors with the `corr` function.
 
 image::images/math-expressions/corrsim1.png[]
 
-Covariance is an unscaled measure of correlation. Covariance is the measure
-used by the multivariate simulations so its useful to also compute the
-covariance for the two stock returns. The example below computes
-the covariance.
+Covariance is an unscaled measure of correlation.
+Covariance is the measure used by the multivariate simulations so it's useful to also compute the covariance for the two stock returns.
+The example below computes the covariance.
 
 image::images/math-expressions/corrsim2.png[]
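The two computations above might be combined into a single expression along these lines (query parameters are illustrative assumptions):

[source,text]
----
let(a=search(stocks, q="ticker_s:all", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    b=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    all=col(a, change_d),
    cvx=col(b, change_d),
    correlation=corr(all, cvx),
    covariance=cov(all, cvx))
----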
 
@@ -210,17 +190,14 @@ image::images/math-expressions/corrsim2.png[]
 A covariance matrix is actually what's needed by the
 `multiVariateNormalDistribution` as it contains both the variance of the
 two stock return vectors and the covariance between the two
-vectors. The `cov` function will compute the covariance matrix for the
+vectors.
+The `cov` function will compute the covariance matrix for
 the columns of a matrix.
 
-The example below demonstrates how
-to compute the covariance matrix by adding the `all` and `cvx` vectors
-as rows to a matrix. The matrix is then transposed with the `transpose`
-function so that the `all` vector
-is the first column and the `cvx` vector is the second column.
+The example below demonstrates how to compute the covariance matrix by adding the `all` and `cvx` vectors as rows to a matrix.
+The matrix is then transposed with the `transpose` function so that the `all` vector is the first column and the `cvx` vector is the second column.
 
-The `cov` function then computes the covariance matrix for the
-columns of the matrix and returns the result.
+The `cov` function then computes the covariance matrix for the columns of the matrix and returns the result.
 
 image::images/math-expressions/corrsim3.png[]
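A sketch of the matrix construction and covariance computation described above, reusing the illustrative searches from the correlation example:

[source,text]
----
let(a=search(stocks, q="ticker_s:all", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    b=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    all=col(a, change_d),
    cvx=col(b, change_d),
    m=transpose(matrix(all, cvx)),
    c=cov(m))
----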
 
@@ -240,40 +217,35 @@ cvx [0.13106056985285258, 0.7409729840230235]
 The example below demonstrates a Monte Carlo simulation with two stock tickers using the
 `multiVariateNormalDistribution`.
 
-In the example, result sets with the *change_d* field for both stock tickers, *all* (Allstate) and *cvx*
-(Chevron),
+In the example, result sets with the `change_d` field for both stock tickers, `all` (Allstate) and `cvx` (Chevron),
 are retrieved and read into vectors.
 
 A matrix is then created from the two vectors and is transposed so
-the matrix contains two columns, one with the *all* vector and one with the *cvx* vector.
+the matrix contains two columns, one with the `all` vector and one with the `cvx` vector.
 
-Then the `multiVariateNormalDistribution` is created with two parameters. The first parameter
-is an array of *mean* values. In this case the means for the *all* vector and the *cvx* vector. The
-second parameter is the covariance matrix which was created from the 2 column matrix of the two vectors.
+Then the `multiVariateNormalDistribution` is created with two parameters.
+The first parameter is an array of `mean` values.
+In this case the means for the `all` vector and the `cvx` vector.
+The second parameter is the covariance matrix which was created from the 2-column matrix of the two vectors.
 
-The `monteCarlo` function then performs the simulation by drawing 100 samples from the `multiVariateNormalDistribution` on
-each iteration. Each sample set is a matrix with 100 rows and 2 columns containing stock return samples
-from the *all* and *cvx* distributions. The distributions of the columns will match the normal
-distributions used to create the `multiVariateNormalDistribution`. The covariance of the sample columns
-will match the covariance matrix.
+The `monteCarlo` function then performs the simulation by drawing 100 samples from the `multiVariateNormalDistribution` on each iteration.
+Each sample set is a matrix with 100 rows and 2 columns containing stock return samples from the `all` and `cvx` distributions.
+The distributions of the columns will match the normal distributions used to create the `multiVariateNormalDistribution`.
+The covariance of the sample columns will match the covariance matrix.
 
-On each iteration the `grandSum` function is used to sum all the values of the sample matrix to get the total
-stock returns for both stocks.
+On each iteration the `grandSum` function is used to sum all the values of the sample matrix to get the total stock returns for both stocks.
 
-The output of the simulation is a vector which can be treated as an empirical distribution in exactly the
-same manner as the single stock ticker simulation. In this example it is plotted as a 50 bin histogram which
-visualizes the probability of the different total returns from 100 days of stock returns
-for the tickers *all* and *cvx*
+The output of the simulation is a vector which can be treated as an empirical distribution in exactly the same manner as the single stock ticker simulation.
+In this example it is plotted as a 50-bin histogram which visualizes the probability of the different total returns from 100 days of stock returns for the tickers `all` and `cvx`.
 
 
 image::images/math-expressions/mnorm.png[]
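Putting the pieces together, the multivariate simulation might look something like this sketch (sample counts and query parameters are assumptions for illustration):

[source,text]
----
let(a=search(stocks, q="ticker_s:all", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    b=search(stocks, q="ticker_s:cvx", fl="change_d, date_dt",
             sort="date_dt asc", rows="500"),
    all=col(a, change_d),
    cvx=col(b, change_d),
    d=multiVariateNormalDistribution(array(mean(all), mean(cvx)),
                                     cov(transpose(matrix(all, cvx)))),
    s=monteCarlo(samples=sample(d, 100),
                 grandSum(samples),
                 50000),
    e=empiricalDistribution(s, 50))
----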
 
 === The Effect of Correlation
 
-The covariance matrix can be changed to study the effect on the simulation. The example
-below demonstrates this by providing a hard coded covariance matrix with a higher covariance
-value for the two vectors. This results is a simulated outcome distribution with a higher standard deviation
-or larger spread from the mean. This measures the degree that higher correlation produces higher volatility
+The covariance matrix can be changed to study the effect on the simulation.
+The example below demonstrates this by providing a hard coded covariance matrix with a higher covariance value for the two vectors.
+This results in a simulated outcome distribution with a higher standard deviation, or larger spread from the mean.
+This measures the degree to which higher correlation produces higher volatility
 in the random walk.
 
 image::images/math-expressions/mnorm2.png[]
diff --git a/solr/solr-ref-guide/src/statistics.adoc b/solr/solr-ref-guide/src/statistics.adoc
index c2c378a..b212724 100644
--- a/solr/solr-ref-guide/src/statistics.adoc
+++ b/solr/solr-ref-guide/src/statistics.adoc
@@ -362,7 +362,7 @@ in the variables *amzn* and *goog*. The `percentile` function is then used to ca
 the variable *p* is used to specify the list of percentiles that are calculated.
 
 Finally `zplot` is used to plot the percentiles sequence on the *x-axis* and the calculated
-percentile values for both distributions on the *y axis*. And a line plot is used
+percentile values for both distributions on the *y-axis*. A line plot is used
 to visualize the QQ plot.
 
 image::images/math-expressions/quantile-plot.png[]
@@ -432,7 +432,7 @@ Finally the `zplot` function is used to plot the correlation matrix as a heat ma
 image::images/math-expressions/corrmatrix.png[]
 
 Notice in the example the correlation matrix is square with complaint types shown on both
-the *x* and *y* axises. The color of the cells in the heat map shows the
+the x-axis and y-axis. The color of the cells in the heat map shows the
 intensity of the correlation between the complaint types.
 
 The heat map is interactive, so mousing over one of the cells pops up the values
diff --git a/solr/solr-ref-guide/src/time-series.adoc b/solr/solr-ref-guide/src/time-series.adoc
index 5797490..60229be 100644
--- a/solr/solr-ref-guide/src/time-series.adoc
+++ b/solr/solr-ref-guide/src/time-series.adoc
@@ -22,11 +22,11 @@ in Streaming Expressions and Math Expressions.
 == Time Series Aggregation
 
 The `timeseries` function performs fast, distributed time
-series aggregation leveraging Solr's builtin faceting and date math capabilities.
+series aggregation leveraging Solr's built-in faceting and date math capabilities.
 
-The example below performs a monthly time series aggregation over a collection of
-daily stock price data.  In this example the average monthly closing price is calculated for the stock
-ticker *amzn* between a specific date range.
+The example below performs a monthly time series aggregation over a collection of daily stock price data.
+In this example the average monthly closing price is calculated for the stock
+ticker *AMZN* between a specific date range.
 
 [source,text]
 ----
@@ -74,8 +74,8 @@ When this expression is sent to the `/stream` handler it responds with:
       {
         "date_dt": "2010-07",
         "avg(close_d)": 117.5190476190476
-      },
-      ...
+      }
+]}}
 ----
 
 Using Zeppelin-Solr this time series can be visualized using a line chart.
@@ -89,30 +89,28 @@ Before a time series can be smoothed or modeled the data will need to be vectori
 The `col` function can be used
 to copy a column of data from a list of tuples into an array.
 
-The expression below demonstrates the vectorization of the *date_dt* and *avg(close_d)* fields.
-The `zplot` function is then used to plot the months on the x axis and the average closing prices
-on the y axis.
+The expression below demonstrates the vectorization of the `date_dt` and `avg(close_d)` fields.
+The `zplot` function is then used to plot the months on the x-axis and the average closing prices on the y-axis.
 
 image::images/math-expressions/timeseries2.png[]
 
 
 == Smoothing
 
-Time series smoothing is often used to remove the noise from a time series and help
-spot the underlying trend.
+Time series smoothing is often used to remove the noise from a time series and help spot the underlying trend.
 The math expressions library has three *sliding window* approaches
-for time series smoothing. The *sliding window* approaches use a summary value
-from a sliding window of the data to calculate a new set of smoothed data points.
+for time series smoothing.
+These approaches use a summary value from a sliding window of the data to calculate a new set of smoothed data points.
 
 The three *sliding window* functions are lagging indicators, which means
 they don't start to move in the direction of the trend until the trend affects
-the summary value of the sliding window. Because of this lagging quality these smoothing
-functions are often used to confirm the direction of the trend.
+the summary value of the sliding window.
+Because of this lagging quality these smoothing functions are often used to confirm the direction of the trend.
 
 === Moving Average
 
 The `movingAvg` function computes a simple moving average over a sliding window of data.
-The example below generates a time series, vectorizes the *avg(close_d)* field and computes the
+The example below generates a time series, vectorizes the `avg(close_d)` field and computes the
 moving average with a window size of 5.
 
 The moving average function returns an array that is of shorter length
@@ -120,9 +118,9 @@ then the original vector. This is because results are generated only when a full
 is available for computing the average. With a window size of five the moving average will
 begin generating results at the 5th value. The prior values are not included in the result.
 
-The `zplot` function is then used to plot the months on the x axis, and the average close and moving
-average on the y axis. Notice that the `ltrim` function is used to trim the first 4 values from
-the x axis and the average closing prices. This is done to line up the three arrays so they start
+The `zplot` function is then used to plot the months on the x-axis, and the average close and moving
+average on the y-axis. Notice that the `ltrim` function is used to trim the first 4 values from
+the x-axis and the average closing prices. This is done to line up the three arrays so they start
 from the 5th value.
 
 image::images/math-expressions/movingavg.png[]
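A sketch of the smoothing expression; the date range, the `+1MONTH` gap, and the `ltrim` alignment are illustrative values, not taken from the screenshot:

[source,text]
----
let(a=timeseries(stocks, q="ticker_s:amzn",
                 field="date_dt",
                 start="2010-06-01T00:00:00Z",
                 end="2017-06-01T00:00:00Z",
                 gap="+1MONTH",
                 format="yyyy-MM",
                 avg(close_d)),
    x=col(a, date_dt),
    y=col(a, avg(close_d)),
    smoothed=movingAvg(y, 5),
    zplot(x=ltrim(x, 4), close=ltrim(y, 4), movingAvg=smoothed))
----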
@@ -226,7 +224,7 @@ For this example we'll be working with daily stock prices for Amazon over a two
 period. The daily stock data will provide a larger data set to study.
 
 In the example below the `search` expression is used to return the daily closing price
-for the ticker *amzn* over a two year period.
+for the ticker *AMZN* over a two year period.
 
 image::images/math-expressions/anomaly.png[]
 
@@ -254,7 +252,7 @@ The `outliers` function takes four parameters:
 * Numeric vector
 * Low probability threshold
 * High probability threshold
-* List of results that the numeric vector was selected from.
+* List of results that the numeric vector was selected from
 
 The `outliers` function iterates the numeric vector and uses the probability
 distribution to calculate the cumulative probability of each value. If the cumulative
@@ -266,9 +264,9 @@ It also includes the cumulative probability and the value of the outlier.
 The example below shows the `outliers` function applied to the Amazon stock
 price data set. The empirical distribution of the moving mean absolute deviation is
 the first parameter. The vector containing the moving mean absolute
-deviations is the second parameter. -1 is the low and .99 is the high probability
-thresholds. -1 means that low outliers will not be considered. The final parameter
-is the original result set containing the *close_d* and *date_dt* fields.
+deviations is the second parameter. `-1` is the low and `.99` is the high probability
+thresholds. `-1` means that low outliers will not be considered. The final parameter
+is the original result set containing the `close_d` and `date_dt` fields.
 
 The output of the `outliers` function contains the results where an outlier was detected.
 In this case 5 results above the .99 probability threshold were detected.
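A sketch of the anomaly detection pipeline described above; the `movingMAD` window size of 5 and the histogram bin count are illustrative assumptions:

[source,text]
----
let(a=search(stocks, q="ticker_s:amzn", fl="close_d, date_dt",
             sort="date_dt asc", rows="1000"),
    closes=col(a, close_d),
    devs=movingMAD(closes, 5),
    d=empiricalDistribution(devs, 10),
    out=outliers(d, devs, -1, .99, a))
----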
@@ -281,7 +279,7 @@ image::images/math-expressions/outliers.png[]
 
 Math Expressions has a number of functions that can be used to
 model a time series. These functions include linear regression,
-polynomial and harmonic curve fitting, loess regression and KNN regression.
+polynomial and harmonic curve fitting, loess regression, and KNN regression.
 
 Each of these functions can model a time series and be used for
 interpolation (predicting values within the dataset) and several
@@ -297,8 +295,8 @@ monthly average closing price for Amazon over an eight year period.
 In this example the `polyfit` function returns a fitted model for the *y*
 axis, which is the average monthly closing prices, using a 4 degree polynomial.
 The degree of the polynomial determines the number of curves in the
-model. The fitted model is set to the variable *y1*. The fitted model
-is then directly plotted with `zplot` along with the original *y*
+model. The fitted model is set to the variable `y1`. The fitted model
+is then directly plotted with `zplot` along with the original `y`
 values.
 
 The visualization shows the smooth line fit through the average closing
@@ -312,17 +310,19 @@ image::images/math-expressions/timemodel.png[]
 The `polyfit` function can also be used to extrapolate a time series to forecast
 future stock prices. The example below demonstrates a 10 month forecast.
 
-In the example the `polyfit` function fits a model to the *y* access and the model
-is set to the variable *m*. Then to create a forecast 10 zeros are appended
-to the *y* axis to create new vector called y10. Then a new x axis is created using
-the `natural` function which returns a sequence of whole numbers 0 to the length of y10.
-The new x axis is stored in the variable x10.
+In the example the `polyfit` function fits a model to the y-axis and the model
+is set to the variable `m`.
+Then, to create a forecast, 10 zeros are appended
+to the y-axis to create a new vector called `y10`.
+Then a new x-axis is created using
+the `natural` function which returns a sequence of whole numbers 0 to the length of `y10`.
+The new x-axis is stored in the variable `x10`.
 
-The `predict` function uses the fitted model to predict values for the new x axis stored in
-variable x10.
+The `predict` function uses the fitted model to predict values for the new x-axis stored in
+variable `x10`.
 
-The `zplot` function is then used to plot the x10 vector on the x axis and the y10 vector and extrapolated
-model on the y axis. Notice that the y10 vector drops to zero where the observed data
+The `zplot` function is then used to plot the `x10` vector on the x-axis and the `y10` vector and extrapolated
+model on the y-axis. Notice that the `y10` vector drops to zero where the observed data
 ends, but the forecast continues along the fitted curve
 of the model.
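The forecast steps above might be sketched as follows; the `addAll` concatenation with a hard-coded array of 10 zeros and the `timeseries` parameters are illustrative assumptions:

[source,text]
----
let(a=timeseries(stocks, q="ticker_s:amzn",
                 field="date_dt",
                 start="2010-06-01T00:00:00Z",
                 end="2018-06-01T00:00:00Z",
                 gap="+1MONTH",
                 format="yyyy-MM",
                 avg(close_d)),
    y=col(a, avg(close_d)),
    m=polyfit(y, 4),
    y10=addAll(y, array(0,0,0,0,0,0,0,0,0,0)),
    x10=natural(length(y10)),
    forecast=predict(m, x10),
    zplot(x=x10, observed=y10, forecast=forecast))
----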