Posted to dev@mahout.apache.org by Ted Dunning <te...@gmail.com> on 2009/12/23 21:51:15 UTC

Re: [jira] Commented: (MAHOUT-173) Implement clustering of massive-domain attributes

I never saw much progress on this.

On Wed, Dec 23, 2009 at 11:58 AM, Sean Owen (JIRA) <ji...@apache.org> wrote:

>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794197#action_12794197]
>
> Sean Owen commented on MAHOUT-173:
> ----------------------------------
>
> Pinging this issue -- has there been any progress in the past 3.5 months, or
> should we shelve it?
>
> > Implement clustering of massive-domain attributes
> > -------------------------------------------------
> >
> >                 Key: MAHOUT-173
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-173
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Clustering
> >            Reporter: Matias Bjørling
> >            Priority: Trivial
> >   Original Estimate: 30h
> >  Remaining Estimate: 30h
> >
> > Implement the clustering algorithm described in "A Framework for
> Clustering Massive-Domain Data Streams" by Charu C. Aggarwal.
> > Steps:
> > 1. Implement a baseline solution to compare against.
> > 2. Figure out how to implement the loading of clustering by looking at
> the k-means implementation.
> > 3. Implement the Count-Min sketch algorithm for each cluster.
> > 4. Find out how to give the user the power to choose the distance
> function for the input data (maybe already possible?)
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>


-- 
Ted Dunning, CTO
DeepDyve