Posted to user@mahout.apache.org by Clive Cox <cl...@rummble.com> on 2011/05/10 21:10:34 UTC

Mahout 542 on kddcup track2 data

Hi,

I'm trying to test MAHOUT-542 (ALS matrix factorization) on the KDD Cup
track2 data set and would like some feedback.

I am using the latest mahout 0.5 snapshot.

I converted the trainIdx2.txt data using
org.apache.mahout.cf.taste.example.kddcup.ToCSV

When training on this I got errors, which seemed to be because the
ratings are in the range 0-100 and the trainer did not like the zero
values. So I hacked ratings of zero to be 1.
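
For reference, the zero-to-one hack can be sketched roughly as below. This
is only an illustration, not the exact script used: it assumes the ToCSV
output is one `userID,itemID,rating` triple per line, and the function name
`remap_zero_ratings` is made up for this example.

```python
def remap_zero_ratings(lines, replacement=1):
    """Yield CSV lines with any rating of zero replaced by `replacement`.

    Assumes each non-empty line is 'userID,itemID,rating' (an assumption
    about the converted file, not something checked against ToCSV).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating = line.split(",")[:3]
        if float(rating) == 0:
            rating = str(replacement)
        yield f"{user},{item},{rating}"


if __name__ == "__main__":
    sample = ["1,10,0", "1,11,90", "2,10,50"]
    for out in remap_zero_ratings(sample):
        print(out)
```

Note that mapping 0 to 1 merges two rating levels if genuine ratings of 1
exist. Adding 1 to every rating would keep the ordering intact, at the cost
of shifting the scale to 1-101.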

I trained using --numFeatures 20 --numIterations 10 --lambda 0.065

The training seemed to succeed. As a simple way to get a result set for
track2, I used predictFromFactorization to predict ratings for
testIdx2.txt, then for each user marked the top 3 predicted ratings as
'1' in the result and the other 3 as '0'.
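
The labelling step described above amounts to something like the following
sketch. It assumes the six predicted ratings for one user are already
collected in a list in test-file order; the helper name `label_top_half` is
hypothetical and not part of Mahout.

```python
def label_top_half(predictions, top_n=3):
    """Return 0/1 labels, with 1 for the `top_n` highest predictions.

    `predictions` holds the predicted ratings for one user's candidate
    items, in the order they appear in the test file. Ties are broken in
    favour of the item that appears first.
    """
    ranked = sorted(range(len(predictions)),
                    key=lambda i: predictions[i],
                    reverse=True)
    winners = set(ranked[:top_n])
    return [1 if i in winners else 0 for i in range(len(predictions))]


if __name__ == "__main__":
    # Made-up predictions for one user, purely for illustration.
    print(label_top_half([55.0, 80.2, 10.1, 79.9, 30.0, 91.5]))
```

Because only the ordering within each user's six items matters here, the
absolute scale of the predicted ratings does not affect the labels.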

However, the error for this was 49.9%, which seems equivalent to a
random result.
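
To make the "random" comparison concrete, here is a small check of what
chance level looks like. It assumes that exactly 3 of each user's 6 test
items are true positives and that the error is the fraction of mislabelled
items; both are assumptions about the track2 setup, and
`expected_random_error` is just an illustrative name.

```python
from itertools import combinations


def expected_random_error(n_items=6, n_positive=3):
    """Average error when `n_positive` of `n_items` are picked at random.

    Enumerates every possible pick and counts how many items end up with
    the wrong label compared to a fixed set of true positives.
    """
    truth = set(range(n_positive))  # which items are truly positive
    total_wrong = 0
    n_picks = 0
    for picked in combinations(range(n_items), n_positive):
        # Items in exactly one of the two sets are mislabelled.
        total_wrong += len(truth.symmetric_difference(picked))
        n_picks += 1
    return total_wrong / (n_items * n_picks)


if __name__ == "__main__":
    print(expected_random_error())
```

Under those assumptions the average comes out at 0.5, so an error of 49.9%
is indistinguishable from picking 3 of the 6 items at random.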

Has anyone else tried MAHOUT-542 on this data set who can provide
feedback?

 Thanks

 Clive