Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2010/08/09 16:05:36 UTC
Clustering on EMR
Has anyone run Clustering (Kmeans) on EMR lately, per https://cwiki.apache.org/confluence/display/MAHOUT/Mahout+on+Elastic+MapReduce?
Here's what I ran, using the CLI:
./elastic-mapreduce -j j-31BXNQA7ATCCV \
  --jar s3://news-vecs/mahout-core-0.4-SNAPSHOT.job \
  --main-class org.apache.mahout.clustering.kmeans.KMeansDriver \
  --arg "--input" --arg "s3://news-vecs/part-out.vec" \
  --arg "--clusters" --arg s3://news-vecs/kmeans/clusters/ \
  --arg "--k" --arg 10 \
  --arg "--output" --arg s3://news-vecs/out/ \
  --arg "--distanceMeasure" --arg "org.apache.mahout.common.distance.CosineDistanceMeasure" \
  --arg "--convergenceDelta" --arg 0.001 \
  --arg "--overwrite" \
  --arg "--maxIter" --arg 50 \
  --arg "--clustering"
It seems to run, but I don't see that it does anything useful, and the out directory is definitely not created.
Anyone have insight?
Thanks,
Grant