Posted to commits@lucene.apache.org by jb...@apache.org on 2019/10/06 18:29:28 UTC

[lucene-solr] branch SOLR-13105-visual updated: SOLR-13105: Inprove geometry docs 2

This is an automated email from the ASF dual-hosted git repository.

jbernste pushed a commit to branch SOLR-13105-visual
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git


The following commit(s) were added to refs/heads/SOLR-13105-visual by this push:
     new 7d409bb  SOLR-13105: Inprove geometry docs 2
7d409bb is described below

commit 7d409bbcc80e836233ab4452df349b2c57ad152a
Author: Joel Bernstein <jb...@apache.org>
AuthorDate: Sun Oct 6 14:29:06 2019 -0400

    SOLR-13105: Inprove geometry docs 2
---
 .../solr-ref-guide/src/computational-geometry.adoc |  64 +++++++++++++++++++--
 .../src/images/math-expressions/hullplot.png       | Bin 195004 -> 191801 bytes
 solr/solr-ref-guide/src/machine-learning.adoc      |   6 +-
 3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/solr/solr-ref-guide/src/computational-geometry.adoc b/solr/solr-ref-guide/src/computational-geometry.adoc
index 819f82b..8018a90 100644
--- a/solr/solr-ref-guide/src/computational-geometry.adoc
+++ b/solr/solr-ref-guide/src/computational-geometry.adoc
@@ -20,35 +20,87 @@
 This section of the math expressions user guide covers computational geometry functions.
 
 <<Convex Hull, Convex Hull>> -
+<<Visualization, Visualization>> -
 <<Enclosing Disk, Enclosing Disk>>
 
-
 == Convex Hull
 
 A convex hull is the smallest convex set of points that encloses a data set. Math expressions has support for computing
 the convex hull of a 2D data set. Once a convex hull has been calculated, a set of math expression functions
-can be applied to geometrically describe the convex hull.
+can be applied to geometrically describe and visualize the convex hull.
 
+=== Visualization
 
+The `convexHull` function can be used to visualize a border around a
+set of 2D points. Border visualizations can be useful for understanding where data points are
+in relation to the border. In the examples below the `convexHull` function is used
+to visualize a border for a set of latitude and longitude points of rat sightings in the nyc311
+complaints database. An investigation of the border around the rat sightings can be done
+to better understand how rats may be entering or exiting the specific region.
 
-=== Visualization
+==== Scatter Plot
 
-The `convexHull` function can be used visualize a border around a set of 2D
+Before visualizing the convex hull it's often useful to visualize the 2D points as a scatter plot.
 
+In this example the `random` function draws a sample of records from the nyc311 (complaints database) collection where
+the complaint description matches "rat sighting" and the zip code is 11238. The latitude and longitude fields
+are then vectorized and plotted as a scatter plot with longitude on the *x* axis and latitude on the
+*y* axis.
 
 image::images/math-expressions/convex0.png[]
 
+Notice from the scatter plot that many of the points appear to lie near the border of the plot.
+
+==== Convex Hull Plot
+
+The `convexHull` function can be used to visualize the border. The example uses the same points
+drawn from the nyc311 database, but instead of plotting the points directly the latitude and
+longitude points are added as rows to a matrix. The matrix is then transposed with the `transpose`
+function so that each row of the matrix contains a single latitude and longitude point.
+
+The `convexHull` function is then used to calculate the convex hull for the matrix of points. The
+convex hull is set to a variable called *hull*.
+
+Once the convex hull has been created the `getVertices` function can be used to
+retrieve the matrix of points in the scatter plot that comprise the convex border around the scatter plot.
+The `colAt` function can then be used to retrieve the latitude and longitude vectors from the matrix
+so they can be visualized by the `zplot` function. In the example below the convex hull points are
+visualized as a scatter plot.
+
 image::images/math-expressions/hullplot.png[]
 
+Notice that the 15 points in the scatter plot describe the latitude and longitude points of the
+convex hull.
 
-image::images/math-expressions/convex1.png[]
+==== Projecting and Clustering
 
-image::images/math-expressions/convex2.png[]
+Once a convex hull has been calculated the `projectToBorder` function can then be used to project
+points to the nearest point on the border. In the example below the `projectToBorder` function
+is used to project the original scatter plot points to the nearest border.
 
+The `projectToBorder` function returns a matrix of lat, lon points for the border projections. In
+the example the matrix of border points is then clustered into 7 clusters using kmeans clustering.
+The `zplot` function is then used to plot the clustered border points.
 
+image::images/math-expressions/convex1.png[]
 
+Notice in the visualization it's easy to see which spots along the border have the highest
+density of points. In the case of the rat sightings this information is useful in understanding
+which border points are closest for the rats to enter or exit from.
 
+==== Plotting the Centroids
+
+Once the border points have been clustered it's very easy to extract the centroids of the clusters
+and plot them on a map. The example below extracts the centroids from the clusters using the
+`getCentroids` function. `getCentroids` returns the matrix of lat, lon points which represent
+the centroids of border clusters. The `colAt` function can then be used to extract the lat, lon
+vectors so they can be plotted on a map using `zplot`.
+
+image::images/math-expressions/convex2.png[]
 
+The map above shows the centroids of the border clusters. The centroids from the highest
+density clusters can now be zoomed in on and investigated geospatially to determine what might be
+the best places to begin an investigation of the border.
 
 == Enclosing Disk
 
diff --git a/solr/solr-ref-guide/src/images/math-expressions/hullplot.png b/solr/solr-ref-guide/src/images/math-expressions/hullplot.png
index 7f9413c..8e51e80 100644
Binary files a/solr/solr-ref-guide/src/images/math-expressions/hullplot.png and b/solr/solr-ref-guide/src/images/math-expressions/hullplot.png differ
diff --git a/solr/solr-ref-guide/src/machine-learning.adoc b/solr/solr-ref-guide/src/machine-learning.adoc
index a83b7ed..21c5ead 100644
--- a/solr/solr-ref-guide/src/machine-learning.adoc
+++ b/solr/solr-ref-guide/src/machine-learning.adoc
@@ -584,7 +584,9 @@ for examining the clusters and centroids.
 === 2D Cluster Visualization
 
 The `zplot` function has direct support for plotting 2D clusters by using the *clusters* named parameter.
-The example demonstrates this capability by clustering and visualizing latitude and longitude points.
+The example below demonstrates this capability by clustering and visualizing latitude and longitude points.
+
+==== Clustered Scatter Plot
 
 In this example the `random` function draws a sample of records from the nyc311 (complaints database) collection where
 the complaint description matches "rat sighting" and latitude is populated in the record. The latitude and longitude fields
@@ -602,6 +604,8 @@ insight into the clusters of rat sightings throughout the five boroughs of New Y
 example it highlights a cluster of dense sightings in Brooklyn at cluster5 and cluster17,
 surrounded by less dense clusters.
 
+==== Plotting the Centroids
+
 The centroids of each cluster can then be easily plotted on a *map* to visualize the center of the
 clusters. In the example below the centroids are extracted from the clusters using the `getCentroids`
 function, which returns a matrix of the centroids.