Posted to user@mahout.apache.org by "Faizan(Aroha)" <fa...@arohalabs.net> on 2011/12/19 09:02:48 UTC
Clustering - k-means as a search
Hello,
I'm trying to implement k-means as a search.
I've performed k-means clustering on a huge dataset.
Now, if I have a new (small) dataset or document, how do I determine
which cluster it belongs to?
Thanks in advance.
Regards,
Faizan Shaikh
Aroha Labs(Private) Ltd
Re: Clustering - k-means as a search
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
The KMeansDriver has a method (clusterData) which you can invoke from a
Java program to cluster (classify) your new data with the old clusters.
For this to work, the new vectors must be the same size as the ones used
for training, and their elements must denote the same attributes. There
is currently no CLI to invoke this step independently of the
buildClusters (training) step; that is under development.
As Paritosh indicates, we are planning to refactor all of the
clusterData implementations into an independent job so the redundant
implementations in the various clustering algorithms can be consolidated.
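To make the idea concrete, here is a minimal plain-Java sketch of what "classifying new data with the old clusters" amounts to for k-means: each new vector is assigned to the nearest existing centroid. This is not Mahout code and does not call clusterData. The class NearestCentroid, its methods, and the sample centroids are hypothetical, and squared Euclidean distance is an assumption (use whichever distance measure the clusters were trained with). The size check mirrors the requirement above that new vectors match the training vectors.

```java
import java.util.Arrays;

/** Plain-Java illustration only; hypothetical names, not Mahout's API. */
final class NearestCentroid {

    /** Squared Euclidean distance; rejects vectors of different sizes. */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vector sizes differ: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the centroid closest to the given point. */
    static int nearest(double[][] centroids, double[] point) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("No centroids");
        }
        int best = 0;
        double bestDistance = squaredDistance(centroids[0], point);
        for (int k = 1; k < centroids.length; k++) {
            double d = squaredDistance(centroids[k], point);
            if (d < bestDistance) {
                bestDistance = d;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed centroids from an earlier k-means run (made-up values).
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0}, {0.0, 10.0} };
        // New documents, vectorized with the same attributes as the training data.
        double[][] newDocs = { {1.0, 2.0}, {9.0, 8.5}, {1.5, 9.0} };
        for (double[] doc : newDocs) {
            System.out.println(Arrays.toString(doc) + " -> cluster " + nearest(centroids, doc));
        }
    }
}
```

With these sample values the three documents are assigned to clusters 0, 1 and 2 respectively. No retraining happens; the centroids stay fixed and only the assignment step runs.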
On 12/19/11 3:46 AM, Paritosh Ranjan wrote:
> This feature is in development.
>
> Try using ClusterClassifier. Populate it with the clusters you have as
> models.
> Then use ClusterIterator with KMeansClusteringPolicy.
>
> Hope it would solve your problem.
Re: Clustering - k-means as a search
Posted by Paritosh Ranjan <pr...@xebia.com>.
This feature is in development.
Try using ClusterClassifier. Populate it with your existing clusters as
models.
Then use ClusterIterator with KMeansClusteringPolicy.
I hope this solves your problem.
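As a rough picture of the pattern described here (a classifier populated with cluster models, plus a policy that decides the assignment), the following plain-Java sketch may help. It is not Mahout's ClusterClassifier, ClusterIterator or KMeansClusteringPolicy; CentroidClassifier and HardAssignmentPolicy are hypothetical stand-ins, and the 1 / (1 + distance) score is just one convenient choice for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a classifier that holds cluster models. */
final class CentroidClassifier {
    private final List<double[]> models = new ArrayList<>();

    /** Registers one existing cluster centroid as a model. */
    void addModel(double[] centroid) {
        models.add(centroid.clone());
    }

    /** Scores a point against every model; the scores sum to 1. */
    double[] classify(double[] point) {
        if (models.isEmpty()) {
            throw new IllegalStateException("No models loaded");
        }
        double[] scores = new double[models.size()];
        double total = 0.0;
        for (int k = 0; k < scores.length; k++) {
            // Closer centroids get higher scores (assumed scoring function).
            scores[k] = 1.0 / (1.0 + distance(models.get(k), point));
            total += scores[k];
        }
        for (int k = 0; k < scores.length; k++) {
            scores[k] /= total;
        }
        return scores;
    }

    private static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector sizes differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        CentroidClassifier classifier = new CentroidClassifier();
        classifier.addModel(new double[] {0.0, 0.0});   // made-up centroids
        classifier.addModel(new double[] {10.0, 10.0});
        double[] scores = classifier.classify(new double[] {9.0, 9.5});
        System.out.println("assigned cluster: " + HardAssignmentPolicy.select(scores));
    }
}

/** Hard-assignment policy, as in k-means: pick the single best-scoring model. */
final class HardAssignmentPolicy {
    static int select(double[] scores) {
        int best = 0;
        for (int k = 1; k < scores.length; k++) {
            if (scores[k] > scores[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

Separating scoring from selection means a different policy (for example, a soft assignment that keeps all scores) could be swapped in without touching the models.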
On 19-12-2011 15:11, Faizan(Aroha) wrote:
> Yes you are correct. Do you have any suggestions ?
RE: Clustering - k-means as a search
Posted by "Faizan(Aroha)" <fa...@arohalabs.net>.
Yes, you are correct. Do you have any suggestions?
-----Original Message-----
From: Paritosh Ranjan [mailto:pranjan@xebia.com]
Sent: Monday, December 19, 2011 1:27 PM
To: user@mahout.apache.org
Subject: Re: Clustering - k-means as a search
You want to classify the new vectors (smaller dataset) with the old
clusters ( huge dataset ). Am I correct?
Paritosh
Re: Clustering - k-means as a search
Posted by Paritosh Ranjan <pr...@xebia.com>.
You want to classify the new vectors (the smaller dataset) with the old
clusters (from the huge dataset). Am I correct?
Paritosh
On 19-12-2011 13:32, Faizan(Aroha) wrote:
> Hello,
>
> I'm trying to implement k-means as a search.
>
> I've performed k-means clustering on a huge dataset.
>
> Now if I have a new (small)dataset or document , how will I determine with
> which cluster it belongs?
>
> Thanks in advance.
>
> Regards,
> Faizan Shaikh
> Aroha Labs(Private) Ltd