Posted to commits@lucene.apache.org by jb...@apache.org on 2019/10/01 18:04:24 UTC

[lucene-solr] branch SOLR-13105-visual updated: SOLR-13105: Update machine learning docs 5

This is an automated email from the ASF dual-hosted git repository.

jbernste pushed a commit to branch SOLR-13105-visual
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git


The following commit(s) were added to refs/heads/SOLR-13105-visual by this push:
     new b1a5a6a  SOLR-13105: Update machine learning docs 5
b1a5a6a is described below

commit b1a5a6a7562c814b0e8d0f3a4c2fd4c34837f588
Author: Joel Bernstein <jb...@apache.org>
AuthorDate: Tue Oct 1 14:04:03 2019 -0400

    SOLR-13105: Update machine learning docs 5
---
 .../src/images/math-expressions/2Dcentroids.png    | Bin 0 -> 2637377 bytes
 .../src/images/math-expressions/2Dcluster.png      | Bin 0 -> 512321 bytes
 .../src/images/math-expressions/centroidzoom.png   | Bin 0 -> 2159995 bytes
 solr/solr-ref-guide/src/machine-learning.adoc      |  34 ++++++++++++++++++++-
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/solr/solr-ref-guide/src/images/math-expressions/2Dcentroids.png b/solr/solr-ref-guide/src/images/math-expressions/2Dcentroids.png
new file mode 100644
index 0000000..8d574f5
Binary files /dev/null and b/solr/solr-ref-guide/src/images/math-expressions/2Dcentroids.png differ
diff --git a/solr/solr-ref-guide/src/images/math-expressions/2Dcluster.png b/solr/solr-ref-guide/src/images/math-expressions/2Dcluster.png
new file mode 100644
index 0000000..892a34b
Binary files /dev/null and b/solr/solr-ref-guide/src/images/math-expressions/2Dcluster.png differ
diff --git a/solr/solr-ref-guide/src/images/math-expressions/centroidzoom.png b/solr/solr-ref-guide/src/images/math-expressions/centroidzoom.png
new file mode 100644
index 0000000..9b98938
Binary files /dev/null and b/solr/solr-ref-guide/src/images/math-expressions/centroidzoom.png differ
diff --git a/solr/solr-ref-guide/src/machine-learning.adoc b/solr/solr-ref-guide/src/machine-learning.adoc
index 8599102..44754a1 100644
--- a/solr/solr-ref-guide/src/machine-learning.adoc
+++ b/solr/solr-ref-guide/src/machine-learning.adoc
@@ -575,6 +575,38 @@ The `kmeans` functions performs k-means clustering of the rows of a matrix.
 Once the clustering has been completed there are a number of useful functions available
 for examining the clusters and centroids.
 
+
+=== 2D Cluster Visualization
+
+The `zplot` function has direct support for plotting 2D clusters by using the *clusters* named parameter.
+The example demonstrates this capability by clustering and visualizing latitude and longitude points.
+
+In this example the `random` function draws a sample of records from the nyc311 (complaints database) collection where
+the complaint description matches "rat sighting" and latitude is populated in the record. The latitude and longitude fields
+are then vectorized and added as rows to a matrix. The matrix is transposed so each row contains a single latitude, longitude
+point. The `kmeans` function is then used to cluster the latitude and longitude points into 5 clusters. The `zplot` function
+is then used to visualize the clusters as a scatter chart.
+
+image::images/math-expressions/2Dcluster.png[]
+
+The scatter plot above shows each lat/lon point plotted on a Euclidean plane. Each cluster is shown in
+a different color. This plot provides a significant amount of information about the size, shape and dispersion
+of the different clusters.
+
+The centroids of each cluster can then be easily plotted on a *map* to visualize the center of the
+clusters. In the example below the centroids are extracted from the clusters using the `getCentroids`
+function, which returns a matrix of the centroids. In this example the matrix contains 2D lat/lon points.
+
+The `colAt` function is then used to extract the latitude and longitude columns by index from the matrix, which
+are plotted using `zplot`. A map visualization is used to display the cluster centroids.
+
+image::images/math-expressions/2Dcentroids.png[]
+
+The map can be zoomed to see clearly where the centers of the clusters lie.
+
+image::images/math-expressions/centroidzoom.png[]
+
+
 === Phrase Extraction
 
 In the example below the `kmeans` function is used to cluster a result set from a movie review data-set
@@ -646,7 +678,7 @@ When this expression is sent to the `/stream` handler it responds with:
 }
 ----
 
-=== 2D Cluster Visualization
+
 
 == Multi K-Means Clustering