Posted to dev@mahout.apache.org by chalitha udara Perera <ch...@gmail.com> on 2014/03/14 07:36:02 UTC

Developing a Kohonen Network for Mahout

Hi everyone,


I recently had the opportunity to work with Apache Mahout while developing a
document clustering component for a CMS. My current research interests
include the use of neural networks for text clustering. With Mahout 0.9 moving
towards neural networks (the MLP classifier), I thought it would be
interesting to add a clustering module based on the Kohonen network
(Self-Organizing Map, SOM), probably the most widely used unsupervised
neural network.


Many variants of the SOM algorithm have been developed [1]. One of them, the
Growing Self-Organizing Map (GSOM), addresses some of the limitations of SOM
and is a good candidate for hierarchical clustering [2].


I also went through the related JIRA issues
(MAHOUT-64 <https://issues.apache.org/jira/browse/MAHOUT-64>,
MAHOUT-1344 <https://issues.apache.org/jira/browse/MAHOUT-1344>). If
possible, I would like to contribute to developing Self-Organizing Maps for
Mahout, probably starting with the online version of the SOM.
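
To make "online version" concrete, here is a minimal sketch of a single-machine
online SOM update in plain Java. It is only an illustration under stated
assumptions: it uses no Mahout classes, the class and method names
(OnlineSomSketch, bestMatchingUnit, train) are hypothetical, and the Gaussian
neighbourhood on a rectangular grid is one common choice rather than a
proposed design.

```java
import java.util.Random;

/** Hypothetical sketch of an online SOM on a rows x cols grid (not Mahout API). */
final class OnlineSomSketch {
    final int rows, cols, dim;
    final double[][] weights; // weights[r * cols + c] is the node at grid cell (r, c)

    OnlineSomSketch(int rows, int cols, int dim, long seed) {
        this.rows = rows;
        this.cols = cols;
        this.dim = dim;
        this.weights = new double[rows * cols][dim];
        Random rnd = new Random(seed);
        for (double[] w : weights) {
            for (int i = 0; i < dim; i++) {
                w[i] = rnd.nextDouble(); // random initial weights in [0, 1)
            }
        }
    }

    /** Squared Euclidean distance between input x and the given node. */
    double squaredDistance(double[] x, int node) {
        double d = 0.0;
        for (int i = 0; i < dim; i++) {
            double diff = x[i] - weights[node][i];
            d += diff * diff;
        }
        return d;
    }

    /** Index of the best matching unit (the node closest to x). */
    int bestMatchingUnit(double[] x) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int n = 0; n < weights.length; n++) {
            double d = squaredDistance(x, n);
            if (d < bestDist) {
                bestDist = d;
                best = n;
            }
        }
        return best;
    }

    /** One online step: move every node toward x, scaled by a Gaussian of its grid distance to the BMU. */
    void train(double[] x, double learningRate, double radius) {
        int bmu = bestMatchingUnit(x);
        int bmuRow = bmu / cols;
        int bmuCol = bmu % cols;
        for (int n = 0; n < weights.length; n++) {
            int dr = n / cols - bmuRow;
            int dc = n % cols - bmuCol;
            double h = Math.exp(-(dr * dr + dc * dc) / (2.0 * radius * radius));
            for (int i = 0; i < dim; i++) {
                weights[n][i] += learningRate * h * (x[i] - weights[n][i]);
            }
        }
    }

    public static void main(String[] args) {
        OnlineSomSketch som = new OnlineSomSketch(3, 3, 2, 42L);
        double[][] data = {{0.9, 0.1}, {0.1, 0.9}, {0.5, 0.5}};
        for (int epoch = 0; epoch < 50; epoch++) {
            for (double[] x : data) {
                som.train(x, 0.1, 1.0); // fixed rate and radius; real training would decay both
            }
        }
        for (double[] x : data) {
            System.out.println("BMU index: " + som.bestMatchingUnit(x));
        }
    }
}
```

Each input is processed as it arrives, so no pass over the full data set is
needed before updating the map. A practical version would decay the learning
rate and radius over time and use sparse vectors for text.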


Any comments or opinions on this matter are highly appreciated.


[1] R.D. Lawrence, G.S. Almasi, H.E. Rushmeier, "A Scalable Parallel
Algorithm for Self-Organizing Maps with Applications to Sparse Data Mining
Problems"

[2] T. Smith, D. Alahakoon, "Growing Self-Organizing Map for Online
Continuous Clustering"


Thanks,

Chalitha