Posted to commits@mahout.apache.org by bu...@apache.org on 2013/11/21 11:34:51 UTC

svn commit: r887476 - in /websites/staging/mahout/trunk/content: ./ users/clustering/cluster-dumper.html

Author: buildbot
Date: Thu Nov 21 10:34:51 2013
New Revision: 887476

Log:
Staging update by buildbot for mahout

Modified:
    websites/staging/mahout/trunk/content/   (props changed)
    websites/staging/mahout/trunk/content/users/clustering/cluster-dumper.html

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Nov 21 10:34:51 2013
@@ -1 +1 @@
-1544091
+1544093

Modified: websites/staging/mahout/trunk/content/users/clustering/cluster-dumper.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/clustering/cluster-dumper.html (original)
+++ websites/staging/mahout/trunk/content/users/clustering/cluster-dumper.html Thu Nov 21 10:34:51 2013
@@ -414,32 +414,29 @@ the case where you've executed a cluster
 and will have sub-folders for each cluster outputs and ClusteredPoints</p>
 <p>Run the clusterdump utility as follows as a standalone Java Program through Eclipse - if you are using eclipse, setup mahout-utils as a project as specified in <a href="../developers/buildingmahout.html">Working with Maven in Eclipse</a>.
     To execute ClusterDumper.java,</p>
-<div class="codehilite"><pre><span class="o">*</span> <span class="n">Under</span> <span class="n">mahout</span><span class="o">-</span><span class="n">utils</span><span class="p">,</span> <span class="n">Right</span><span class="o">-</span><span class="n">Click</span> <span class="n">on</span> <span class="n">ClusterDumper</span><span class="p">.</span><span class="n">java</span>
-<span class="o">*</span> <span class="n">Choose</span> <span class="n">Run</span><span class="o">-</span><span class="n">As</span><span class="p">,</span> <span class="n">Run</span> <span class="n">Configurations</span>
-<span class="o">*</span> <span class="n">On</span> <span class="n">the</span> <span class="n">left</span> <span class="n">menu</span><span class="p">,</span> <span class="n">click</span> <span class="n">on</span> <span class="n">Java</span> <span class="n">Application</span>
-<span class="o">*</span> <span class="n">On</span> <span class="n">the</span> <span class="n">top</span><span class="o">-</span><span class="n">bar</span> <span class="n">click</span> <span class="n">on</span> &quot;<span class="n">New</span> <span class="n">Launch</span> <span class="n">Configuration</span>&quot;
-<span class="o">*</span> <span class="n">A</span> <span class="n">new</span> <span class="n">launch</span> <span class="n">should</span> <span class="n">be</span> <span class="n">automatically</span> <span class="n">created</span> <span class="n">with</span> <span class="n">project</span> <span class="n">as</span>
-
-&quot;<span class="n">mahout</span><span class="o">-</span><span class="n">utils</span>&quot; <span class="n">and</span> <span class="n">Main</span> <span class="n">Class</span> <span class="n">as</span> &quot;<span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">mahout</span><span class="p">.</span><span class="n">utils</span><span class="p">.</span><span class="n">clustering</span><span class="p">.</span><span class="n">ClusterDumper</span>&quot;
-
-<span class="o">*</span> <span class="n">In</span> <span class="n">the</span> <span class="n">arguments</span> <span class="n">tab</span><span class="p">,</span> <span class="n">specify</span> <span class="n">the</span> <span class="n">below</span> <span class="n">arguments</span>
-
-<span class="o">--</span><span class="n">seqFileDir</span> <span class="o">&lt;</span><span class="n">MAHOUT_HOME</span><span class="o">&gt;/</span><span class="n">examples</span><span class="o">/</span><span class="n">output</span><span class="o">/</span><span class="n">clusters</span><span class="o">-</span>10 
-<span class="o">--</span><span class="n">pointsDir</span> <span class="o">&lt;</span><span class="n">MAHOUT_HOME</span><span class="o">&gt;/</span><span class="n">examples</span><span class="o">/</span><span class="n">output</span><span class="o">/</span><span class="n">clusteredPoints</span> 
-<span class="o">--</span><span class="n">output</span> <span class="o">&lt;</span><span class="n">MAHOUT_HOME</span><span class="o">&gt;/</span><span class="n">examples</span><span class="o">/</span><span class="n">output</span><span class="o">/</span><span class="n">clusteranalyze</span><span class="p">.</span><span class="n">txt</span>
-
-<span class="n">replace</span> <span class="o">&lt;</span><span class="n">MAHOUT_HOME</span><span class="o">&gt;</span> <span class="n">with</span> <span class="n">the</span> <span class="n">actual</span> <span class="n">path</span> <span class="n">of</span> <span class="n">your</span> $<span class="n">MAHOUT_HOME</span>
-
-<span class="o">*</span> <span class="n">Hit</span> <span class="n">run</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">ClusterDumper</span> <span class="n">using</span> <span class="n">Eclipse</span><span class="p">.</span> <span class="n">Setting</span> <span class="n">breakpoints</span> <span class="n">etc</span> <span class="n">should</span> <span class="n">just</span> <span class="n">work</span> <span class="n">fine</span><span class="p">.</span>
-</pre></div>
-
-
+<ul>
+<li>Under mahout-utils, Right-Click on ClusterDumper.java</li>
+<li>Choose Run-As, Run Configurations</li>
+<li>On the left menu, click on Java Application</li>
+<li>On the top-bar click on "New Launch Configuration"</li>
+<li>
+<p>A new launch configuration should be created automatically, with the project set to
+"mahout-utils" and the Main Class set to "org.apache.mahout.utils.clustering.ClusterDumper"</p>
+</li>
+<li>
+<p>In the arguments tab, specify the below arguments</p>
+<p>--seqFileDir &lt;MAHOUT_HOME&gt;/examples/output/clusters-10 
+--pointsDir &lt;MAHOUT_HOME&gt;/examples/output/clusteredPoints 
+--output &lt;MAHOUT_HOME&gt;/examples/output/clusteranalyze.txt</p>
+<p>Replace &lt;MAHOUT_HOME&gt; with the actual path of your $MAHOUT_HOME.</p>
+</li>
+<li>
+<p>Hit Run to execute the ClusterDumper from Eclipse. Setting breakpoints etc. should work fine.</p>
+</li>
+</ul>
 <p>Reading the output file</p>
-<div class="codehilite"><pre><span class="n">This</span> <span class="n">will</span> <span class="n">output</span> <span class="n">the</span> <span class="n">clusters</span> <span class="n">into</span> <span class="n">a</span> <span class="n">file</span> <span class="n">called</span> <span class="n">clusteranalyze</span><span class="p">.</span><span class="n">txt</span> <span class="n">inside</span> $<span class="n">MAHOUT_HOME</span><span class="o">/</span><span class="n">examples</span><span class="o">/</span><span class="n">output</span>
-<span class="n">Sample</span> <span class="n">data</span> <span class="n">will</span> <span class="n">look</span> <span class="n">like</span>
-</pre></div>
-
-
+<p>This writes the clusters to a file called clusteranalyze.txt inside $MAHOUT_HOME/examples/output.
+Sample data will look like:</p>
 <p>CL-0 { n=116 c=<a href="29.922,-30.407,-30.373,-30.094,-29.886,-29.937,-29.751,-30.054,-30.039,-30.126,-29.764,-29.835,-30.503,-29.876,-29.990,-29.605,-29.379,-30.120,-29.882,-30.161,-29.825,-30.074,-30.001,-30.421,-29.867,-29.736,-29.760,-30.192,-30.134,-30.082,-29.962,-29.512,-29.736,-29.594,-29.493,-29.761,-29.183,-29.517,-29.273,-29.161,-29.215,-29.731,-29.154,-29.113,-29.348,-28.981,-29.543,-29.192,-29.479,-29.406,-29.715,-29.344,-29.628,-29.074,-29.347,-29.812,-29.058,-29.177,-29.063,-29.607.html">29.922, 30.407, 30.373, 30.094, 29.886, 29.937, 29.751, 30.054, 30.039, 30.126, 29.764, 29.835, 30.503, 29.876, 29.990, 29.605, 29.379, 30.120, 29.882, 30.161, 29.825, 30.074, 30.001, 30.421, 29.867, 29.736, 29.760, 30.192, 30.134, 30.082, 29.962, 29.512, 29.736, 29.594, 29.493, 29.761, 29.183, 29.517, 29.273, 29.161, 29.215, 29.731, 29.154, 29.113, 29.348, 28.981, 29.543, 29.192, 29.479, 29.406, 29.715, 29.344, 29.628, 29.074, 29.347, 29.812, 29.058, 29.177, 29.063, 29.607</a>
  r=[3.463, 3.351, 3.452, 3.438, 3.371, 3.569, 3.253, 3.531, 3.439, 3.472,
 3.402, 3.459, 3.320, 3.260, 3.430, 3.452, 3.320, 3.499, 3.302, 3.511,
@@ -447,13 +444,9 @@ and will have sub-folders for each clust
 3.651, 3.833, 3.812, 3.433, 4.133, 3.855, 4.123, 3.999, 4.467, 4.731,
 4.539, 4.956, 4.644, 4.382, 4.277, 4.918, 4.784, 4.582, 4.915, 4.607,
 4.672, 4.577, 5.035, 5.241, 4.731, 4.688, 4.685, 4.657, 4.912, 4.300] }</p>
-<div class="codehilite"><pre><span class="n">and</span> <span class="n">on</span><span class="p">...</span>
-
-<span class="n">where</span> <span class="n">CL</span><span class="o">-</span>0 <span class="n">is</span> <span class="n">the</span> <span class="n">Cluster</span> 0 <span class="n">and</span> <span class="n">n</span><span class="p">=</span>116 <span class="n">refers</span> <span class="n">to</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">points</span> <span class="n">observed</span> <span class="n">by</span> <span class="n">this</span> <span class="n">cluster</span> <span class="n">and</span> <span class="n">c</span> <span class="p">=</span> <span class="o">\</span><span class="p">[</span>29<span class="p">.</span>922 <span class="p">..</span><span class="o">.\</span><span class="p">]</span>
-</pre></div>
-
-
-<p>refers to the center of Cluster as a vector and r = [3.463 ..] refers to
+<p>and on...</p>
+<p>where CL-0 is cluster 0, n=116 is the number of points observed by this cluster, c = [29.922 ...]
+ is the center of the cluster as a vector, and r = [3.463 ...] is
 the radius of the cluster as a vector.</p>
    </div>
   </div>
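
Note on the output format described in the page above: the entry layout "CL-0 { n=116 c=[...] r=[...] }" can be read back programmatically. The following is a minimal Python sketch, not part of Mahout. It assumes every entry follows exactly the layout of the sample (identifier, n, c, r, in that order); real dump files may carry other fields or a different layout. The names CLUSTER_RE and parse_cluster are hypothetical, and the sample string is shortened to the first three values of each vector from the sample above.

```python
import re

# Hypothetical pattern for one entry of the form
#   CL-0 { n=116 c=[29.922, ...] r=[3.463, ...] }
# Assumes the field order shown in the sample; vectors may wrap across lines.
CLUSTER_RE = re.compile(
    r"(?P<id>[A-Za-z]+-\d+)\s*\{\s*"
    r"n=(?P<n>\d+)\s+"
    r"c=\[(?P<c>[^\]]*)\]\s+"
    r"r=\[(?P<r>[^\]]*)\]\s*\}"
)


def _floats(text):
    """Turn a comma-separated run of numbers (possibly wrapped) into floats."""
    return [float(tok) for tok in text.split(",") if tok.strip()]


def parse_cluster(entry):
    """Parse one cluster entry into its id, point count, center and radius."""
    match = CLUSTER_RE.search(entry)
    if match is None:
        raise ValueError("text does not look like a cluster entry")
    return {
        "id": match.group("id"),
        "n": int(match.group("n")),
        "center": _floats(match.group("c")),
        "radius": _floats(match.group("r")),
    }


# Shortened sample: first three values of c and r from the page's example.
sample = "CL-0 { n=116 c=[29.922, 30.407, 30.373]\n  r=[3.463, 3.351,\n 3.452] }"
cluster = parse_cluster(sample)
```

Here cluster["n"] holds the number of points observed by the cluster, and cluster["center"] and cluster["radius"] hold the c and r vectors as lists of floats.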