Posted to dev@mahout.apache.org by Jeff Eastman <jd...@windwardsolutions.com> on 2010/07/19 21:45:56 UTC

Re: svn commit: r965587 [1/2] - in /mahout/trunk: core/src/main/java/org/apache/mahout/clustering/canopy/ core/src/main/java/org/apache/mahout/clustering/dirichlet/ core/src/main/java/org/apache/mahout/clustering/fuzzykmeans/ core/src/main/java/org/apache/...

On 7/19/10 12:06 PM, jeastman@apache.org wrote:
> Author: jeastman
> Date: Mon Jul 19 19:06:14 2010
> New Revision: 965587
>
> URL: http://svn.apache.org/viewvc?rev=965587&view=rev
> Log:
> MAHOUT-294:
> - added argMap local to AbstractJob to allow option accessing abstraction
> - added hasOption(), getOption() and optionKey() to AbstractJob to encapsulate all -- prepending
> - revised all clustering components to use hasOption()/getOption()
> - added MahoutTestCase.optKey() to use optionKey() to encapsulate all -- prepending
> - decided not to continue with public addOptions() approach in earlier patch
> - removed all public OPTION_KEY constants in DefaultOptionsCreator but made
>      respective OPTION constants public
> - revised all clustering tests to use optKey
> - made interClusterDensity, intraClusterDensity and separation() public in CDbwEvaluator
>    
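
For readers skimming the log above, the option-key encapsulation it describes can be sketched roughly as follows. This is a minimal illustration and not the actual AbstractJob source: the class name OptionMapSketch and the putOption() helper are hypothetical, and the sketch assumes argMap is a plain map keyed by the "--"-prefixed option name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the option handling described in the log;
// not the real AbstractJob implementation.
class OptionMapSketch {
  // Assumed shape of argMap: parsed values keyed by "--"-prefixed names.
  private final Map<String, String> argMap = new HashMap<String, String>();

  /** Builds the map key for an option name by prepending "--". */
  static String optionKey(String optionName) {
    return "--" + optionName;
  }

  /** Records a parsed option value (helper invented for this sketch). */
  void putOption(String optionName, String value) {
    argMap.put(optionKey(optionName), value);
  }

  /** Returns true if the option was supplied. */
  boolean hasOption(String optionName) {
    return argMap.containsKey(optionKey(optionName));
  }

  /** Returns the option's value, or null if it was not supplied. */
  String getOption(String optionName) {
    return argMap.get(optionKey(optionName));
  }
}
```

With this arrangement callers only ever pass the bare option name, so the "--" prefix lives in one place (optionKey()) instead of being repeated in each component and test.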
Hi Robin,

I just read over MIA chapter 10 and thought you might also want to
mention the CDbwEvaluator introduced in MAHOUT-236. I've just made its
intermediate calculations public, and they appear to line up pretty
well with the cluster evaluation criteria in the chapter, so users
could potentially call these methods to compute cluster quality metrics
too. The benefit of these evaluator methods is that they can be
computed after selecting n representative points from potentially large
clusters. It would also be nice to get some more eyeballs on these
calculations to make sure I didn't miss something in translating from
the paper.

Jeff