Posted to commits@mahout.apache.org by co...@apache.org on 2009/03/17 23:36:00 UTC

[CONF] Apache Lucene Mahout: SyntheticControlData (page edited)

SyntheticControlData (MAHOUT) edited by Grant Ingersoll
      Page: http://cwiki.apache.org/confluence/display/MAHOUT/SyntheticControlData
   Changes: http://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=103663&originalVersion=1&revisedVersion=2






Content:
---------------------------------------------------------------------

h1. Introduction

This quick start page shows how to run the clustering examples on the Synthetic Control Data set.


h1. Steps

* Download the data at http://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series
* In <MAHOUT_HOME>/, build the Job file: mvn install
* Start up Hadoop: <HADOOP_HOME>/bin/start-all.sh
* Put the data: <HADOOP_HOME>/bin/hadoop fs -put <PATH TO DATA> testdata
* Run the Job: <HADOOP_HOME>/bin/hadoop jar <MAHOUT_HOME>/examples/target/apache-mahout-examples-0.1-SNAPSHOT.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job (substitute the Job class of whichever clustering example you want to run: KMeans, Canopy, etc.)
* Copy the output out of HDFS (for example with <HADOOP_HOME>/bin/hadoop fs -get) and have a look.
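
The steps above could be collected into a small shell script along the following lines. This is only a sketch under stated assumptions: the default paths are placeholders, the run and quickstart helpers are hypothetical names introduced for illustration, the job file is assumed to be launched through the hadoop jar subcommand, and mvn -f is used in place of changing into <MAHOUT_HOME>. Downloading the data and copying the results back are left out, because the page gives neither a direct file URL nor an output directory name.

```shell
#!/bin/sh
# Sketch of the quick start steps. All default paths are placeholders
# (assumptions); set the variables for your own installation.
HADOOP_HOME="${HADOOP_HOME:-/path/to/hadoop}"
MAHOUT_HOME="${MAHOUT_HOME:-/path/to/mahout}"
# DATA_FILE is the file downloaded by hand from the UCI page.
DATA_FILE="${DATA_FILE:-/path/to/data}"
JOB_FILE="$MAHOUT_HOME/examples/target/apache-mahout-examples-0.1-SNAPSHOT.job"
JOB_CLASS="${JOB_CLASS:-org.apache.mahout.clustering.syntheticcontrol.kmeans.Job}"

# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Hypothetical wrapper around the steps, stopping at the first failure:
#   1. build the job file with Maven
#   2. start Hadoop
#   3. put the data into HDFS as "testdata"
#   4. run the chosen clustering job
quickstart() {
  run mvn -f "$MAHOUT_HOME/pom.xml" install &&
  run "$HADOOP_HOME/bin/start-all.sh" &&
  run "$HADOOP_HOME/bin/hadoop" fs -put "$DATA_FILE" testdata &&
  run "$HADOOP_HOME/bin/hadoop" jar "$JOB_FILE" "$JOB_CLASS"
}
```

After sourcing the file, calling quickstart only prints the four commands, which makes it easy to check the paths first; DRY_RUN=0 quickstart executes them. Setting JOB_CLASS before sourcing selects a different clustering Job.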

---------------------------------------------------------------------