Posted to commits@mahout.apache.org by co...@apache.org on 2009/10/02 16:42:00 UTC

[CONF] Apache Lucene Mahout > ClassifyingYourData

Space: Apache Lucene Mahout (http://cwiki.apache.org/confluence/display/MAHOUT)
Page: ClassifyingYourData (http://cwiki.apache.org/confluence/display/MAHOUT/ClassifyingYourData)

Added by Isabel Drost:
---------------------------------------------------------------------
+*Mahout_0.2*+

After you have worked through the [QuickStart] and are familiar with the basics of Mahout, it is time to build a classifier from your own data.

The following pieces *may* be useful for getting started:

h1. Input

For starters, you will need your data in an appropriate Vector format. Note that this format has changed since Mahout 0.1.

* See [Creating Vectors]

h2. Text Preparation

* See [Creating Vectors from Text] 
* http://www.lucidimagination.com/search/document/4a0e528982b2dac3/document_clustering
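
As a rough illustration of what "vector format" means for text, the sketch below turns a document into sparse term counts, where each distinct term is a dimension and its count is the value. It is plain Java and does *not* use Mahout's own Vector classes; the class and method names (TermFrequencySketch, termFrequencies) are hypothetical and chosen only for this example. The pages linked above describe the tooling Mahout actually provides.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: this is NOT Mahout's Vector API.
// All names in this class are hypothetical.
class TermFrequencySketch {

    // Lowercase the text, split on runs of non-letter characters,
    // and count how often each remaining token occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty token for empty or punctuation-led input
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each distinct term acts as one dimension of a sparse vector.
        Map<String, Integer> vector = termFrequencies("Spam, spam and more spam");
        System.out.println(vector);
    }
}
```

A real pipeline would normally also map terms to integer indices and apply a weighting such as TF-IDF before training a classifier; this sketch stops at raw counts.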

h1. Running the Process

h2. Naive Bayes

Background: [bayesian | Naive Bayes Classification]

Documentation for running Naive Bayes from the command line: [bayesian-commandline]

h2. C-Bayes

Background: [https://issues.apache.org/jira/browse/MAHOUT-60 | C-Bayes Classification]

Documentation for running C-Bayes from the command line: [c-bayes-commandline]

h2. Random Forests

Background: [random-forests | Random Forests Classification]

Documentation for running Random Forests from the command line: [random-forests-commandline]

