Posted to user@mahout.apache.org by "Allen, Ronald L." <al...@ornl.gov> on 2014/01/31 13:55:24 UTC
Using Mahout to cluster a large CSV file
Hi all,
Has anyone had any success using Mahout kmeans to cluster data in a single large CSV file? If so, how did you do it?
Thanks,
Ronnie
Re: Using Mahout to cluster a large CSV file
Posted by Bertrand Dechoux <de...@gmail.com>.
I guess the big (no pun intended) question is what is your definition of a
large CSV.
Bertrand
On Fri, Jan 31, 2014 at 2:17 PM, Suneel Marthi <su...@yahoo.com> wrote:
> Use Mahout's CSVVectorIterator.java to read your input CSV file and generate
> vectors.
>
> You pass in a java.io.Reader to your CSV file and it generates Dense
> Vectors (from CSV).
>
> You could then feed the generated vectors into KMeans clustering.
RE: Using Mahout to cluster a large CSV file
Posted by "Allen, Ronald L." <al...@ornl.gov>.
Thank you for the response!
I will try this out and let you know how it goes!
________________________________________
From: Suneel Marthi [suneel_marthi@yahoo.com]
Sent: Friday, January 31, 2014 8:17 AM
To: user@mahout.apache.org
Subject: Re: Using Mahout to cluster a large CSV file
Use Mahout's CSVVectorIterator.java to read your input CSV file and generate vectors.
You pass in a java.io.Reader to your CSV file and it generates Dense Vectors (from CSV).
You could then feed the generated vectors into KMeans clustering.
Re: Using Mahout to cluster a large CSV file
Posted by Suneel Marthi <su...@yahoo.com>.
Use Mahout's CSVVectorIterator.java to read your input CSV file and generate vectors.
You pass in a java.io.Reader to your CSV file and it generates Dense Vectors (from CSV).
You could then feed the generated vectors into KMeans clustering.
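For readers who want to see the shape of that approach, here is a minimal, dependency-free Java sketch. It is not Mahout code: the class CsvVectors and its methods iterate, nearest and lloydStep are hypothetical names made up for illustration. It assumes purely numeric, comma-separated rows with no header and no quoting. The iterator plays the role described above for CSVVectorIterator (Reader in, one dense vector per row out), and lloydStep stands in for a single k-means iteration over the streamed vectors.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in, NOT Mahout's API. Assumes numeric CSV rows,
// comma-separated, with no header line and no quoted fields.
final class CsvVectors {

    // Wraps a Reader and yields one dense vector (double[]) per non-blank row,
    // so the whole file never has to sit in memory at once.
    static Iterator<double[]> iterate(Reader reader) {
        final BufferedReader in = new BufferedReader(reader);
        return new Iterator<double[]>() {
            private String pending = advance();

            // Reads ahead to the next non-blank line, or returns null at end of input.
            private String advance() {
                try {
                    String line;
                    while ((line = in.readLine()) != null) {
                        if (!line.trim().isEmpty()) {
                            return line;
                        }
                    }
                    return null;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public double[] next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                String[] cells = pending.split(",");
                double[] vector = new double[cells.length];
                for (int i = 0; i < cells.length; i++) {
                    vector[i] = Double.parseDouble(cells[i].trim());
                }
                pending = advance();
                return vector;
            }
        };
    }

    // Index of the centroid closest to the point (squared Euclidean distance).
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centroids[c][j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // One k-means (Lloyd) iteration over streamed points: assign each point to
    // its nearest centroid, then return the per-cluster means. A cluster that
    // receives no points keeps its previous centroid.
    static double[][] lloydStep(Iterator<double[]> points, double[][] centroids) {
        int k = centroids.length;
        int dim = centroids[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        while (points.hasNext()) {
            double[] p = points.next();
            int c = nearest(p, centroids);
            counts[c]++;
            for (int j = 0; j < dim; j++) {
                sums[c][j] += p[j];
            }
        }
        double[][] updated = new double[k][];
        for (int c = 0; c < k; c++) {
            updated[c] = centroids[c].clone();
            if (counts[c] > 0) {
                for (int j = 0; j < dim; j++) {
                    updated[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return updated;
    }
}
```

In this sketch each k-means iteration would open a fresh Reader on the CSV file and call lloydStep again, so only the centroids and the running sums stay in memory. With Mahout itself, the vectors would come from CSVVectorIterator and go to Mahout's own KMeans implementation rather than to this hand-written loop.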