Posted to dev@mahout.apache.org by Ted Dunning <te...@gmail.com> on 2009/12/23 21:51:15 UTC
Re: [jira] Commented: (MAHOUT-173) Implement clustering of massive-domain attributes
I never saw much progress on this.
On Wed, Dec 23, 2009 at 11:58 AM, Sean Owen (JIRA) <ji...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794197#action_12794197]
>
> Sean Owen commented on MAHOUT-173:
> ----------------------------------
>
> Pinging this issue -- is there any progress in the past 3.5 months or
> should we shelve it?
>
> > Implement clustering of massive-domain attributes
> > -------------------------------------------------
> >
> > Key: MAHOUT-173
> > URL: https://issues.apache.org/jira/browse/MAHOUT-173
> > Project: Mahout
> > Issue Type: New Feature
> > Components: Clustering
> > Reporter: Matias Bjørling
> > Priority: Trivial
> > Original Estimate: 30h
> > Remaining Estimate: 30h
> >
> > Implement the Clustering algorithm described in "A Framework for
> Clustering Massive-Domain Data Streams" by Charu C. Aggarwal.
> > Steps:
> > 1. Implement baseline solution to compare solutions.
> > 2. Figure out how to implement the loading of clustering by looking at
> the k-means implementation.
> > 3. Implement Count-Min sketch algorithm for each cluster.
> > 4. Find out how to let the user choose the distance
> function for the input data (maybe already possible?)
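
For readers unfamiliar with steps 3 and 4 above, here is a minimal sketch of what "a Count-Min sketch for each cluster" with a user-chosen scoring function might look like. It is not Mahout code and not the exact method from the Aggarwal paper: it is written in Python for brevity, and the names CountMinSketch, overlap and assign, the default width and depth, and the summed-count similarity are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter: `depth` hash rows of `width` counters each."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash per row by salting the item with the row number.
        data = f"{row}:{item}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest available upper bound on the true count.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


def overlap(sketch, point):
    """Hypothetical similarity: sum of estimated counts of the point's values."""
    return sum(sketch.estimate(value) for value in point)


def assign(sketches, point, similarity=overlap):
    """Pick the highest-scoring cluster, then fold the point into its sketch."""
    best = max(range(len(sketches)),
               key=lambda i: similarity(sketches[i], point))
    for value in point:
        sketches[best].add(value)
    return best
```

Passing a different callable as similarity is one way the user-selectable distance function from step 4 could be exposed; whether that matches how the existing k-means code takes its distance measure would need checking against the Mahout source.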
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
--
Ted Dunning, CTO
DeepDyve