Posted to dev@mahout.apache.org by "Jeff Eastman (JIRA)" <ji...@apache.org> on 2008/12/26 19:53:44 UTC
[jira] Updated: (MAHOUT-30) dirichlet process implementation
[ https://issues.apache.org/jira/browse/MAHOUT-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Eastman updated MAHOUT-30:
-------------------------------
Attachment: jeastman.vcf
Hi Isabel,
I'm so happy you had time to look through the code. Getting it to this
point was a great ordeal for me as the math is complicated and I have no
formal statistics background. Ted's help was critical in getting me to
the tipping point where I now understand the implementation well enough
to make progress on my own. I'm getting ready for a week's vacation and
will not have email but would love to continue this dialog and am very
open to your suggestions below. See more comments therein.
Jeff
I was thinking of moving the display code into the examples directory.
I did that so Ted could use his favorite library but he has not been
pursuing it. I'm happy with blog and, as commons does not have the
needed sampling methods without Ted's patches, suggest we could go with
blog. Removing the pluggability would clean up the code some too.
? Does this relate to maven?
Boy, I would too. Especially if it was clear enough that I could
understand it :)
Some of those were terms Ted introduced from my original port of his R
example. I'm not hung up but perhaps we should include him in the
discussion?
> dirichlet process implementation
> --------------------------------
>
> Key: MAHOUT-30
> URL: https://issues.apache.org/jira/browse/MAHOUT-30
> Project: Mahout
> Issue Type: New Feature
> Components: Clustering
> Reporter: Isabel Drost
> Assignee: Jeff Eastman
> Attachments: jeastman.vcf, MAHOUT-30.patch, MAHOUT-30b.patch, MAHOUT-30c.patch
>
>
> Copied over from original issue:
> > Further extension can also be made by assuming an infinite mixture model. The implementation is only slightly more difficult and the result is a (nearly)
> > non-parametric clustering algorithm.