Posted to commits@lucene.apache.org by jb...@apache.org on 2019/10/02 14:16:56 UTC

[lucene-solr] branch SOLR-13105-visual updated: SOLR-13105: Update machine learning docs 11

This is an automated email from the ASF dual-hosted git repository.

jbernste pushed a commit to branch SOLR-13105-visual
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git


The following commit(s) were added to refs/heads/SOLR-13105-visual by this push:
     new 9060aee  SOLR-13105: Update machine learning docs 11
9060aee is described below

commit 9060aee4d8e8cd4b14846dc9990d650e390fdb09
Author: Joel Bernstein <jb...@apache.org>
AuthorDate: Wed Oct 2 10:16:48 2019 -0400

    SOLR-13105: Update machine learning docs 11
---
 solr/solr-ref-guide/src/machine-learning.adoc | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/solr/solr-ref-guide/src/machine-learning.adoc b/solr/solr-ref-guide/src/machine-learning.adoc
index b107391..3f80a4e 100644
--- a/solr/solr-ref-guide/src/machine-learning.adoc
+++ b/solr/solr-ref-guide/src/machine-learning.adoc
@@ -780,22 +780,23 @@ allows vectors to be assigned to more then one cluster. The `fuzziness` paramete
 is a value between 1 and 2 that determines how fuzzy to make the cluster assignment.
 
 After the clustering has been performed the `getMembershipMatrix` function can be called
-on the clustering result to return a matrix describing which clusters each vector belongs to.
+on the clustering result to return a matrix describing the probabilities
+of cluster membership for each vector.
 This matrix can be used to understand relationships between clusters.
 
 In the example below `fuzzyKmeans` is used to cluster the movie reviews matching the phrase "star wars".
 But instead of looking at the clusters or centroids the `getMembershipMatrix` is used to return the
 membership probabilities for each document. The membership matrix is comprised of a row for each
 vector that was clustered. There is a column in the matrix for each cluster.
-The values in the matrix are the probability that the vector belongs to a specific cluster.
+The values in the matrix contain the probability that a specific vector belongs to a specific cluster.
 
-In the example the `corr` function is used to create a *correlation matrix* of the columns of the
+In the example the `corr` function is used to create a *correlation matrix* from the columns of the
 membership matrix. In other words the correlation matrix shows the correlation of the clusters
 based on the document co-occurrence in the clusters.
 
 Notice that in the example cluster3 and cluster5 are very highly correlated, which means that
 many documents had a probability of occurring in both clusters. Further analysis of the key features
-in both clusters can done to understand the reason how these cluster are interconnected.
+in both clusters can be performed to understand how these clusters are interconnected.
 
 [source,text]
 ----