Posted to user@mahout.apache.org by sarath pr <sa...@gmail.com> on 2011/04/08 17:38:11 UTC

help_clusterdump

A text file created using the clusterdump utility is attached. Can
anyone tell me how to identify the IDs of the documents belonging to
each cluster?

mahout clusterdump -s
/home/sarathpr/NetBeansProjects/SNACK1/newsClusters/clusters/clusters-1
-o /home/sarathpr/Desktop/readable/out4.txt -b 100 -n 50 -p
/home/sarathpr/NetBeansProjects/SNACK1/newsClusters/clusters/clusteredPoints
-d /home/sarathpr/NetBeansProjects/SNACK1/newsClusters/dictionary.file-0
-dt sequencefile


-- 
Thank You..!!
Sarath Ramachandran
sarath.amrita@gmail.com
+919995024287

Re: help_clusterdump

Posted by sarath pr <sa...@gmail.com>.

Thanks, Madhusudan, for your response. I have solved the issue by
changing the last boolean argument of the call below from false to true.


DictionaryVectorizer.createTermFrequencyVectors(tokenizedPath,
    new Path(outputDir), conf, minSupport, maxNGramSize, minLLRValue, 2, true,
    reduceTasks, chunkSize, sequentialAccessOutput, false);
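
Once the vectors carry names, the document IDs can be pulled out of the clusterdump text file with a small parser. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the sample lines are made up for illustration, and the line layout (a cluster header such as "VL-0{...}" followed by point lines of the form "<weight>: <name> = [...]") is an assumption about the output that should be checked against your own out4.txt before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClusterDumpDocIds {

    // ASSUMPTION: a cluster header starts with an id immediately followed by '{',
    // for example "VL-0{n=2 c=[...] r=[...]}".
    private static final Pattern CLUSTER_HEADER =
        Pattern.compile("^([A-Za-z]+-\\d+)\\{");

    // ASSUMPTION: a point line looks like "<weight>: <vector name> = [term:weight, ...]".
    private static final Pattern NAMED_POINT =
        Pattern.compile("^\\s*[0-9.Ee+-]+:\\s*(.+?)\\s*=\\s*\\[");

    /** Groups the vector names (document IDs) under the cluster header that precedes them. */
    public static Map<String, List<String>> docIdsByCluster(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            Matcher header = CLUSTER_HEADER.matcher(line);
            if (header.find()) {
                // A new cluster begins; later point lines belong to it.
                current = new ArrayList<>();
                result.put(header.group(1), current);
                continue;
            }
            Matcher point = NAMED_POINT.matcher(line);
            if (current != null && point.find()) {
                current.add(point.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from a real dump.
        List<String> sample = Arrays.asList(
            "VL-0{n=2 c=[news:0.4] r=[]}",
            "\tTop Terms: ",
            "\t\tnews => 0.4",
            "\tWeight:  Point:",
            "\t1.0: /doc1.txt = [news:0.5]",
            "\t1.0: /doc2.txt = [news:0.3]",
            "VL-1{n=1 c=[sport:0.9] r=[]}",
            "\tWeight:  Point:",
            "\t1.0: /doc3.txt = [sport:0.9]");
        System.out.println(docIdsByCluster(sample));
    }
}
```

To run it on a real dump, the sample list could be replaced with the lines of the file, for example via java.nio.file.Files.readAllLines. If the vectors are not named, no point line will match the second pattern and every cluster will come back with an empty list.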


On 4/11/11, Madhusudan Joshi <ma...@gmail.com> wrote:
> I had a similar issue before. I added the parameter --namedVector to the
> command to create named vectors. With that I was able to identify which
> documents belonged to a given cluster using the same clusterdump command.
> Hope this helps.


Re: help_clusterdump

Posted by Madhusudan Joshi <ma...@gmail.com>.

I had a similar issue before. I added the parameter --namedVector to the
command to create named vectors. With that I was able to identify which
documents belonged to a given cluster using the same clusterdump command.
Hope this helps.

-- 
Everything we hear is an opinion, not a fact.
Everything we see is perspective, not the truth.