
Posted to dev@mahout.apache.org by "Jeff Eastman (JIRA)" <ji...@apache.org> on 2008/12/26 19:53:44 UTC

[jira] Updated: (MAHOUT-30) dirichlet process implementation

     [ https://issues.apache.org/jira/browse/MAHOUT-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Eastman updated MAHOUT-30:
-------------------------------

    Attachment: jeastman.vcf

Hi Isabel,

I'm so happy you had time to look through the code. Getting it to this 
point was a great ordeal for me as the math is complicated and I have no 
formal statistics background. Ted's help was critical in getting me to 
the tipping point where I now understand the implementation well enough 
to make progress on my own. I'm getting ready for a week's vacation and 
will not have email but would love to continue this dialog and am very 
open to your suggestions below. See more comments therein.

Jeff

I was thinking of moving the display code into the examples directory.
I did that so Ted could use his favorite library but he has not been 
pursuing it. I'm happy with blog and, as commons does not have the 
needed sampling methods without Ted's patches, suggest we could go with 
blog. Removing the pluggability would clean up the code some too.
? Does this relate to maven?
Boy, I would too. Especially if it was clear enough that I could 
understand it :)
Some of those were terms Ted introduced from my original port of his R 
example. I'm not hung up but perhaps we should include him in the 
discussion?



> dirichlet process implementation
> --------------------------------
>
>                 Key: MAHOUT-30
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-30
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Clustering
>            Reporter: Isabel Drost
>            Assignee: Jeff Eastman
>         Attachments: jeastman.vcf, MAHOUT-30.patch, MAHOUT-30b.patch, MAHOUT-30c.patch
>
>
> Copied over from original issue:
> > Further extension can also be made by assuming an infinite mixture model. The implementation is only slightly more difficult and the result is a (nearly)
> > non-parametric clustering algorithm.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.