Posted to dev@mahout.apache.org by "Chisomo Sakala (JIRA)" <ji...@apache.org> on 2013/11/08 01:54:17 UTC

[jira] [Commented] (MAHOUT-1206) Add density-based clustering algorithms to mahout

    [ https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816858#comment-13816858 ] 

Chisomo Sakala commented on MAHOUT-1206:
----------------------------------------

I'm really excited about this prospect.

The paper <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6253489> describes how to implement DBSCAN with MapReduce (DBSCAN-MR). I have a PDF copy and can email it to anyone interested in viewing it.

I emailed the author of that paper to ask whether they'd be willing to contribute their implementation of DBSCAN-MR to Mahout, but I haven't gotten a response yet.

Here is another paper discussing parallelization of DBSCAN:
<http://conferences.computer.org/sc/2012/papers/1000a053.pdf>

> Add density-based clustering algorithms to mahout
> -------------------------------------------------
>
>                 Key: MAHOUT-1206
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1206
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Yexi Jiang
>              Labels: clustering
>             Fix For: Backlog
>
>
> The existing clustering algorithms (k-means, fuzzy k-means, Dirichlet clustering, and spectral clustering) cluster data by assuming that it can be grouped into regular hyperspheres or ellipsoids. However, in practice, not all data can be clustered this way.
> To enable data to be clustered into arbitrary shapes, clustering algorithms such as DBSCAN, BIRCH, and CLARANS (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have been proposed.
> It would be good to implement one or more of these clustering algorithms to enrich the clustering library.



--
This message was sent by Atlassian JIRA
(v6.1#6144)