Posted to user@mahout.apache.org by Jens Bonerz <jb...@googlemail.com> on 2013/09/20 17:09:57 UTC

how to work with cluster data generated from the reuters example script

Hello all,

I am having trouble figuring out how to access the cluster data generated
by the reuters example script.

I am specifically interested in a plain-text export of the clustered data in
a CSV-like format for further processing (key:value pairs, distances, etc.).

The script already creates the "clusterdump" file. However, that file seems
to contain only a listing of frequent terms.

Can anyone point me in the right direction? Once I figure out how that
works, I hope to modify the script to read my custom sequence file, which
contains key:value pairs (a hash of the URL as the key and the text of the
HTML title tag as the value).
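
To make the record layout concrete, here is a minimal sketch of how one such key:value pair could be built. It is only an illustration: the function name make_record, the use of SHA-1 for the hash, and the regex-based title extraction are all assumptions, and the sketch does not write a Hadoop sequence file.

```python
import hashlib
import re

# Assumption: a simple regex is good enough to pull the <title> text
# out of a page; a real HTML parser would be more robust.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def make_record(url, html):
    """Return a (key, value) pair for one page.

    key   -- hex SHA-1 digest of the URL (the hash function is an assumption)
    value -- text of the HTML title tag, or "" if the page has none
    """
    key = hashlib.sha1(url.encode("utf-8")).hexdigest()
    match = TITLE_RE.search(html)
    # Collapse runs of whitespace so the value stays on a single line.
    value = " ".join(match.group(1).split()) if match else ""
    return key, value


if __name__ == "__main__":
    page = "<html><head><title> Example\n page </title></head></html>"
    key, value = make_record("http://example.com/", page)
    print(key + ":" + value)
```

Each pair would then be one record (key = hash, value = title text) in the sequence file that the clustering reads.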

Many thanks!
Jens <http://www.hightechmg.com>