Posted to user@mahout.apache.org by Sznajder ForMailingList <bs...@gmail.com> on 2014/02/02 11:08:28 UTC

Mapping from docId to clusters in the clusterdump

Hi,

I have a directory containing thousands of text files.
I ran the k-means clustering algorithm, following the tutorial in the Mahout in
Action book.

However, I need to know which text file was mapped to which cluster.

I did not find an easy way to do that. I ran the clusterdump utility,
but I only managed to get the mapping from vector to cluster, not from
document to cluster.

Any help is welcome!

Benjamin

Re: Mapping from docId to clusters in the clusterdump

Posted by Suneel Marthi <su...@yahoo.com>.
Use seqdumper.

See the comments in MAHOUT-1410 for more info.

Sent from my iPhone
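
For readers who want the document-to-cluster mapping as a table, here is a minimal Python sketch that post-processes the text printed by seqdumper. It is not part of Mahout. It assumes each record is printed on one line shaped like `Key: <clusterId>: Value: ... vec: <docName> = [...]`; that line format, the regular expression, and the function name `map_docs_to_clusters` are assumptions made for illustration, so check them against the output of your own Mahout build.

```python
import re

# ASSUMPTION: one record per line, shaped roughly like
#   Key: 12: Value: wt: 1.0 distance: 0.42  vec: /docs/a.txt = [3:0.5, 7:1.2]
# where the key is the cluster id and the text after "vec:" is the vector name
# (the document id). Adjust the pattern if your dump looks different.
LINE_RE = re.compile(
    r"^Key:\s*(?P<cluster>\d+):\s*Value:.*?vec:\s*(?P<doc>.+?)\s*=\s*[\[{]"
)


def map_docs_to_clusters(lines):
    """Return {document name: cluster id} for every line that matches LINE_RE.

    Lines that do not match (headers, counts, blank lines) are skipped.
    """
    doc_to_cluster = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            doc_to_cluster[match.group("doc")] = int(match.group("cluster"))
    return doc_to_cluster


# Hypothetical usage: save the seqdumper output to a file first, then
#   with open("clustered_points.txt") as fh:
#       mapping = map_docs_to_clusters(fh)
```

The input would typically be the dump of the clustered points written by the k-means job, for example the output of `mahout seqdumper -i` pointed at a file under the job's `clusteredPoints` directory; the exact path depends on how the job was run.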

> On Feb 2, 2014, at 5:31 AM, Sznajder ForMailingList <bs...@gmail.com> wrote:
> 
> Wow!
> 
> And what is the command I need to run, please, to see the mapping textId -->
> Cluster?
> 
> Benjamin

Re: Mapping from docId to clusters in the clusterdump

Posted by Sznajder ForMailingList <bs...@gmail.com>.
Wow!

And what is the command I need to run, please, to see the mapping textId -->
Cluster?

Benjamin


On Sun, Feb 2, 2014 at 12:13 PM, Suneel Marthi <su...@yahoo.com> wrote:

> This was fixed as part of JIRA issue MAHOUT-1410.
>
> Sent from my iPhone

Re: Mapping from docId to clusters in the clusterdump

Posted by Suneel Marthi <su...@yahoo.com>.
This was fixed as part of JIRA issue MAHOUT-1410.

Sent from my iPhone

> On Feb 2, 2014, at 5:11 AM, Suneel Marthi <su...@yahoo.com> wrote:
> 
> This is an issue that was fixed very recently (in fact, last week). Please work off the current trunk; you should see the names of the text files that are part of each cluster.

Re: Mapping from docId to clusters in the clusterdump

Posted by Suneel Marthi <su...@yahoo.com>.
This is an issue that was fixed very recently (in fact, last week). Please work off the current trunk; you should see the names of the text files that are part of each cluster.

On Sunday, February 2, 2014 5:09 AM, Sznajder ForMailingList <bs...@gmail.com> wrote:
 
Hi,

I have a directory containing thousands of text files.
I ran the k-means clustering algorithm, following the tutorial in the Mahout in
Action book.

However, I need to know which text file was mapped to which cluster.

I did not find an easy way to do that. I ran the clusterdump utility,
but I only managed to get the mapping from vector to cluster, not from
document to cluster.

Any help is welcome!

Benjamin