Posted to commits@mahout.apache.org by co...@apache.org on 2010/09/24 20:31:00 UTC

[CONF] Apache Mahout > RecommendationExamples

Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: RecommendationExamples (https://cwiki.apache.org/confluence/display/MAHOUT/RecommendationExamples)


Edited by Sean Owen:
---------------------------------------------------------------------
h1. Introduction 

This quick start page describes how to run the recommendation examples provided by Mahout. Mahout comes with four recommendation mining examples, based on the Netflix, Jester, GroupLens and BookCrossing data sets respectively.

h1. Steps 

h2. Testing it on a single machine without a cluster 

In the examples directory type: 
{code} 
mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.bookcrossing.BookCrossingRecommenderEvaluatorRunner" -Dexec.args="<OPTIONS>" 
mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.netflix.NetflixRecommenderEvaluatorRunner" -Dexec.args="<OPTIONS>" 
mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.netflix.TransposeToByUser" -Dexec.args="<OPTIONS>" 
mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.jester.JesterRecommenderEvaluatorRunner" -Dexec.args="<OPTIONS>" 
mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner" -Dexec.args="<OPTIONS>" 
{code} 

Note that the GroupLens example is designed for the "1 million" data set, available at http://www.grouplens.org/node/73 . This file has an unusual format and so has a special parser. The example code here can be easily modified to use a regular FileDataModel and thus work on more standard input, including the other data sets available at this site.
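
As a rough illustration of what "more standard input" could look like, the sketch below rewrites ratings lines into comma-separated form. It assumes the ratings file uses {{user::item::rating::timestamp}} lines and that a regular FileDataModel accepts {{user,item,rating}} lines; neither detail is stated on this page, and the file names in the usage comment are hypothetical.

```shell
# Sketch only: turn "user::item::rating::timestamp" lines (assumed input layout)
# into "user,item,rating" lines (assumed to be readable by a regular FileDataModel).
# Reads from standard input and writes to standard output.
to_csv() {
  awk -F'::' '{ print $1 "," $2 "," $3 }'
}

# Hypothetical usage (file names are placeholders):
#   to_csv < ratings.dat > ratings.csv
```

The timestamp field is dropped here for simplicity; keep it if your model should use it.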

h2. Running it on the cluster 

* In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job will be generated in $MAHOUT_HOME/core/target/ and its name will contain the Mahout version number. For example, when using the Mahout 0.1 release, the job will be mahout-core-0.1.jar 
* (Optional) Start up Hadoop: $HADOOP_HOME/bin/start-all.sh 
* Put the data: $HADOOP_HOME/bin/hadoop fs -put <PATH TO DATA> testdata 
* Run the Job: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/core/target/mahout-core-<MAHOUT VERSION>.job org.apache.mahout.cf.taste.example.<JOB> <OPTIONS> 
* Get the data out of HDFS and have a look. Use $HADOOP_HOME/bin/hadoop fs -lsr output to view all outputs. 
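
The Hadoop steps above can be strung together in a small script. The following is a minimal sketch, not an official Mahout script: the variable names {{MAHOUT_VERSION}} and {{DRY_RUN}} and the function names are made up for illustration, and the job file name simply follows the pattern used in the "Run the Job" step. By default it only prints the commands so they can be checked before anything touches the cluster.

```shell
# Sketch only. Assumes HADOOP_HOME, MAHOUT_HOME and MAHOUT_VERSION are set.
# DRY_RUN is a hypothetical switch: 1 (the default) prints commands, 0 runs them.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"      # show the command instead of executing it
  else
    "$@"
  fi
}

# $1: local path to the data
# $2: job class below org.apache.mahout.cf.taste.example (the <JOB> placeholder)
# remaining arguments: passed through as <OPTIONS>
cluster_steps() {
  data_path=$1
  job=$2
  shift 2
  # Put the data into HDFS
  run "$HADOOP_HOME/bin/hadoop" fs -put "$data_path" testdata
  # Run the job
  run "$HADOOP_HOME/bin/hadoop" jar \
    "$MAHOUT_HOME/core/target/mahout-core-$MAHOUT_VERSION.job" \
    "org.apache.mahout.cf.taste.example.$job" "$@"
  # List everything the job wrote
  run "$HADOOP_HOME/bin/hadoop" fs -lsr output
}
```

Call it as {{cluster_steps <PATH TO DATA> <JOB> <OPTIONS>}}, and set {{DRY_RUN=0}} once the printed commands look right.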

h1. Command line options 
{code} 
Usage: <JOB>
--input (-i) input The Path for input preferences. This argument is optional except for the netflix example.
--help (-h) Print out help 
{code} 
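
To show how these options fit into the single-machine commands listed earlier, here is a small sketch that prints the Maven command with {{--input}} filled in for {{<OPTIONS>}}. The helper name {{example_cmd}} is made up for illustration, and the input path is a placeholder you would replace with your own.

```shell
# Sketch only: print the Maven command for one example runner with --input set.
# $1: class name below org.apache.mahout.cf.taste.example
# $2: path to the input preferences (placeholder, supply your own)
example_cmd() {
  printf 'mvn -q exec:java -Dexec.mainClass="org.apache.mahout.cf.taste.example.%s" -Dexec.args="--input %s"\n' "$1" "$2"
}

# Hypothetical usage, from the examples directory:
#   example_cmd netflix.NetflixRecommenderEvaluatorRunner /path/to/netflix/data
```

The function only prints the command; paste the result into a shell to run it.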


