Posted to user@mahout.apache.org by "Allen, Ronald L." <al...@ornl.gov> on 2014/02/07 15:46:21 UTC

Clustering CSV

Hi,

I've been able to get a CSV file into a sequence file of vectors readable by Mahout.  I have run mahout kmeans and it seems to work.  But when I run mahout clusterdump, it does not work because I do not have a dictionary.file-0.  Is there a way around this, or a way to create this file myself?

Thanks,
Ronald

Re: Clustering CSV

Posted by Suneel Marthi <su...@yahoo.com>.
You wouldn't have a dictionary when creating vectors from CSV (via CsvIterator).
If you would like to see the documents that are part of a cluster, try running the cluster output through seqdumper; that should give the document names (or points) that belong to each cluster.

You need to be working off of Mahout 0.9 or trunk to see the latter working.
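
For illustration, the seqdumper step might look roughly like the sketch below. The output path is hypothetical, and it assumes kmeans was run with the -cl option so that a clusteredPoints directory was written; depending on your version you may need to point -i at an individual part file instead of the directory.

```shell
# Hypothetical k-means output directory; replace with your own path.
KMEANS_OUT="${KMEANS_OUT:-output/kmeans}"

# Assumes kmeans was run with -cl, which writes point-to-cluster
# assignments under <output>/clusteredPoints.
SEQDUMP_CMD="mahout seqdumper -i ${KMEANS_OUT}/clusteredPoints"

if command -v mahout >/dev/null 2>&1; then
  # Dump the assignments as text (cluster id as key, point as value).
  $SEQDUMP_CMD
else
  # Mahout is not installed here, so just show the command.
  echo "mahout not on PATH; would run: $SEQDUMP_CMD"
fi
```

No dictionary is involved in this step, since the vectors came from CSV rather than from tokenized text.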



