Posted to issues@commons.apache.org by "Artem Barger (JIRA)" <ji...@apache.org> on 2016/05/30 22:13:12 UTC
[jira] [Created] (MATH-1371) Provide accelerated kmeans++ implementation
Artem Barger created MATH-1371:
----------------------------------
Summary: Provide accelerated kmeans++ implementation
Key: MATH-1371
URL: https://issues.apache.org/jira/browse/MATH-1371
Project: Commons Math
Issue Type: Improvement
Reporter: Artem Barger
Assignee: Artem Barger
An updated version of the kmeans++ algorithm is available, published in: Elkan, Charles. "Using the triangle inequality to accelerate k-means." ICML. Vol. 3. 2003.
The main idea is to speed up the kmeans iterations by skipping distance computations between centers and points when they are not needed. For example, if after the update a cluster center has not moved too far from a point, the assignment of that point does not change. The accelerated algorithm avoids unnecessary distance calculations by applying the triangle inequality in two different ways, and by keeping track of lower and upper bounds for distances between points and centers.
The algorithm description is available in the paper.
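To illustrate the kind of pruning meant here, below is a minimal Java sketch of one of the two triangle-inequality checks: if d(c, c') >= 2 * d(x, c), then d(x, c') >= d(x, c), so the distance from point x to center c' does not need to be computed. This is only an assumption-laden illustration, not the proposed Commons Math implementation: the class name TriangleInequalityAssignment, its methods, and the distanceCalls counter are hypothetical, and the sketch covers a single assignment pass without the per-point lower and upper bounds that the paper carries across iterations.

```java
final class TriangleInequalityAssignment {

    /** Counts point-to-center distance evaluations (illustration only). */
    static long distanceCalls = 0;

    /** Plain Euclidean distance, not counted. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Point-to-center distance; increments the counter. */
    static double pointToCenter(double[] point, double[] center) {
        distanceCalls++;
        return euclidean(point, center);
    }

    /** Assigns each point to its nearest center, skipping provably useless distances. */
    static int[] assign(double[][] points, double[][] centers) {
        int k = centers.length;

        // Center-to-center distances are computed once per pass: k*(k-1)/2 values.
        double[][] cc = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                cc[i][j] = euclidean(centers[i], centers[j]);
                cc[j][i] = cc[i][j];
            }
        }

        int[] assignment = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            double bestDist = pointToCenter(points[p], centers[0]);
            for (int c = 1; c < k; c++) {
                // Triangle inequality: if d(best, c) >= 2 * d(x, best),
                // then d(x, c) >= d(x, best), so center c cannot be closer.
                if (cc[best][c] >= 2.0 * bestDist) {
                    continue;
                }
                double d = pointToCenter(points[p], centers[c]);
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            assignment[p] = best;
        }
        return assignment;
    }
}
```

With well-separated centers most candidate centers are skipped, so the number of point-to-center distance evaluations falls below the points * centers needed by a brute-force pass, at the cost of the center-to-center distance table.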