Posted to dev@mahout.apache.org by "Jeff Eastman (JIRA)" <ji...@apache.org> on 2010/01/18 20:30:54 UTC

[jira] Resolved: (MAHOUT-251) Generalize Dirichlet models and model distributions to handle n-d and sparse vectors

     [ https://issues.apache.org/jira/browse/MAHOUT-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Eastman resolved MAHOUT-251.
---------------------------------

    Resolution: Fixed

r900519 wrapped up the loose ends in the patch, adding new command-line arguments to DirichletDriver and DirichletJob and fixing a serious defect in SquareRootFunction (it applied abs() instead of sqrt()), which was causing all computed model standard deviations to be badly off.
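
To illustrate why an abs()-for-sqrt() mix-up skews every model std, here is a minimal sketch. It is not the Mahout code: the class StdSketch, the std() helper, and the two function constants are hypothetical names invented for this example, and the running-sums formula (count, sum of x, sum of x^2 per dimension) is an assumption about how such a std might be computed.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration only; none of these names come from Mahout.
public class StdSketch {

    // Stand-in for the defective function: abs() where sqrt() was intended.
    public static final DoubleUnaryOperator BUGGY_ROOT = Math::abs;

    // Stand-in for the corrected function.
    public static final DoubleUnaryOperator FIXED_ROOT = Math::sqrt;

    // Per-dimension standard deviation from running sums:
    // s0 = number of points, s1[i] = sum of x_i, s2[i] = sum of x_i^2.
    // The "root" function is applied to each per-dimension variance.
    public static double[] std(double s0, double[] s1, double[] s2,
                               DoubleUnaryOperator root) {
        double[] result = new double[s1.length];
        for (int i = 0; i < s1.length; i++) {
            double mean = s1[i] / s0;
            double variance = s2[i] / s0 - mean * mean;
            result[i] = root.applyAsDouble(variance);
        }
        return result;
    }

    public static void main(String[] args) {
        // Eight 2-d points; dimension 0 holds {2,4,4,4,5,5,7,9},
        // dimension 1 holds four 10s and four 30s.
        double s0 = 8;
        double[] s1 = {40, 160};
        double[] s2 = {232, 4000};

        double[] buggy = std(s0, s1, s2, BUGGY_ROOT);
        double[] fixed = std(s0, s1, s2, FIXED_ROOT);

        System.out.println("buggy: " + buggy[0] + ", " + buggy[1]);
        System.out.println("fixed: " + fixed[0] + ", " + fixed[1]);
    }
}
```

With these sums the per-dimension variances are 4 and 100, so the corrected function yields stds of 2 and 10, while the abs() variant returns the variances themselves. The error grows with the spread of the data, which is consistent with stds being far off rather than slightly off.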

> Generalize Dirichlet models and model distributions to handle n-d and sparse vectors
> ------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-251
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-251
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>         Attachments: MAHOUT-251-b.patch, MAHOUT-251.patch
>
>
> Users attempting to use Dirichlet Process Clustering on real-life problems cannot use any of the existing models or model distributions, as these hard-code the assumption of a 2-d DenseVector as the underlying data representation. These limitations are overly restrictive, and the code needs to be generalized.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.