Posted to user@mahout.apache.org by Florents Tselai <fl...@gmail.com> on 2012/11/12 23:55:58 UTC

Clustering without hadoop

Hello,

I'm working on market basket data clustering and I'd like to know the
fastest (quick and dirty) way to use Mahout.

Specifically, is it possible to run k-means or EM without having HDFS
configured?

If I'm not mistaken, the Mahout 0.5 k-means implementation didn't require
HDFS, but the latest version does.

Re: Clustering without hadoop

Posted by Johannes Schulte <jo...@gmail.com>.
Hi Florents,

The API has changed, but it still works without HDFS. I also had trouble
getting the right classes together, but here is something that will
hopefully work correctly:

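  // Imports (package names may differ between Mahout versions)
  import java.util.List;
  import org.apache.mahout.clustering.Cluster;
  import org.apache.mahout.clustering.classify.ClusterClassifier;
  import org.apache.mahout.clustering.iterator.ClusterIterator;
  import org.apache.mahout.clustering.iterator.KMeansClusteringPolicy;
  import org.apache.mahout.common.distance.CosineDistanceMeasure;
  import org.apache.mahout.common.distance.DistanceMeasure;

  DistanceMeasure measure = new CosineDistanceMeasure();

  // ClusterUtils is not a Mahout class; points holds the input vectors in memory
  List<Cluster> initialClusters =
      ClusterUtils.getInitialClusters(points, k, measure);

  System.out.println("going mad!");

  // k-means policy with a convergence delta of 0.01
  ClusterClassifier prior =
      new ClusterClassifier(initialClusters, new KMeansClusteringPolicy(0.01));

  // Run up to 10 iterations in memory, no HDFS involved
  ClusterClassifier clustered = ClusterIterator.iterate(points, prior, 10);

  System.out.println(clustered.getModels());

  List<Cluster> finalClusters = clustered.getModels();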
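
For anyone who wants to see what such an in-memory iteration amounts to without any Mahout or Hadoop dependency, here is a minimal plain-Java sketch of k-means with a cosine distance. It is not Mahout's implementation: the class name InMemoryKMeans, its method names, the toy basket vectors, and the stopping rule (stop once no centroid moves by more than delta) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, dependency-free illustration; not part of Mahout.
public class InMemoryKMeans {

    // Cosine distance: 1 - cos(angle between a and b); 0 means same direction.
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 1.0; // assumption: treat a zero vector as unrelated
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    // Index of the centroid closest to the given point.
    static int nearest(double[] point, List<double[]> centroids) {
        int best = 0;
        double bestDistance = cosineDistance(point, centroids.get(0));
        for (int c = 1; c < centroids.size(); c++) {
            double d = cosineDistance(point, centroids.get(c));
            if (d < bestDistance) {
                best = c;
                bestDistance = d;
            }
        }
        return best;
    }

    // Runs at most maxIterations assign/recompute rounds and stops early
    // once no centroid has moved by more than delta.
    static List<double[]> cluster(List<double[]> points, List<double[]> initial,
                                  int maxIterations, double delta) {
        List<double[]> centroids = new ArrayList<>();
        for (double[] c : initial) {
            centroids.add(c.clone());
        }
        int dim = points.get(0).length;
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[centroids.size()][dim];
            int[] counts = new int[centroids.size()];
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            boolean converged = true;
            for (int c = 0; c < centroids.size(); c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                double[] mean = new double[dim];
                for (int d = 0; d < dim; d++) {
                    mean[d] = sums[c][d] / counts[c];
                }
                if (cosineDistance(mean, centroids.get(c)) > delta) {
                    converged = false;
                }
                centroids.set(c, mean);
            }
            if (converged) {
                break;
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Toy baskets as item counts over {bread, milk, beer, chips}.
        List<double[]> points = Arrays.asList(
            new double[] {1, 1, 0, 0},
            new double[] {2, 1, 0, 0},
            new double[] {0, 0, 1, 1},
            new double[] {0, 0, 2, 1});
        // Seed with two of the points, k = 2.
        List<double[]> initial = Arrays.asList(points.get(0), points.get(2));
        for (double[] c : cluster(points, initial, 10, 0.01)) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

The arguments 10 and 0.01 mirror the iteration count and policy parameter used in the Mahout snippet above, but whether Mahout applies the delta in exactly this way is not something this sketch claims.
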


Cheers,


Johannes

