Posted to dev@mahout.apache.org by Gmail <ma...@gmail.com> on 2014/03/04 11:43:08 UTC
kMeans Implementation
Hello,
I was studying the Mahout libraries and I found something strange in your
kMeans implementation.
I was looking inside it and noticed that kMeans only uses map
functions, omitting the reducers. Why did you make this choice?
It does not use the MapReduce programming model, even though
Mahout's core is stated to be Hadoop.
Is this choice driven by performance concerns?
Best regards
Manuel Sequino
Re: kMeans Implementation
Posted by Suneel Marthi <su...@yahoo.com>.
He's talking about the simple k-means, which is a mapper-only job. Sean has already addressed his question.
> On Mar 4, 2014, at 5:49 AM, Sebastian Schelter <ss...@apache.org> wrote:
>
> We have several implementations of k-Means, which one do you refer to?
>
> --sebastian
Re: kMeans Implementation
Posted by Sebastian Schelter <ss...@apache.org>.
We have several implementations of k-Means. Which one are you referring to?
--sebastian
Re: kMeans Implementation
Posted by Sam Bessalah <sa...@gmail.com>.
I don't see why that is a problem.
Re: kMeans Implementation
Posted by Gmail <ma...@gmail.com>.
I used the KMeansDriver class in the clustering.kmeans package.
Yes, I know that the use of MapReduce is mandatory, but I think that
an easier and more MapReduce-oriented implementation exists.
Anyway, I thought it was a choice driven by performance.
Thank you.
On 03/04/2014 11:48 AM, Sean Owen wrote:
> Although I don't know exactly what you're referring to, in general,
> nothing about Map/Reduce means you always use a reducer. There are
> plenty of tasks that are much more appropriate as a map-only or
> reduce-only job. So this assertion doesn't fly to start with. But if
> you see two jobs that might be merged into one, that could be a useful
> suggestion.
Re: kMeans Implementation
Posted by Sean Owen <sr...@gmail.com>.
Although I don't know exactly what you're referring to, in general,
nothing about Map/Reduce means you always use a reducer. There are
plenty of tasks that are much more appropriate as a map-only or
reduce-only job. So this assertion doesn't fly to start with. But if
you see two jobs that might be merged into one, that could be a useful
suggestion.
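
To illustrate the point about map-only jobs, here is a minimal sketch in plain Java, with no Hadoop dependency. It is not Mahout's code: the class name MapOnlyAssignment and its methods are hypothetical, and the centroids are assumed to be fixed and known to every mapper. Under that assumption, assigning each point to its nearest centroid touches one record at a time and needs no aggregation across records, so no reducer is required. In Hadoop itself, a job is usually made map-only by calling Job.setNumReduceTasks(0).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Mahout's implementation.
public class MapOnlyAssignment {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // The "map" function: one input record in, one output record out.
    // It depends only on the point and the (assumed fixed) centroids.
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = squaredDistance(point, centroids[c]);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // A map-only pass: apply the map function to every record independently.
    // No grouping or aggregation step follows, which is why no reducer is needed.
    static List<Integer> assignAll(List<double[]> points, double[][] centroids) {
        List<Integer> assignments = new ArrayList<>();
        for (double[] p : points) {
            assignments.add(nearestCentroid(p, centroids));
        }
        return assignments;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        List<double[]> points = new ArrayList<>();
        points.add(new double[]{1.0, 1.0});
        points.add(new double[]{9.0, 8.0});
        System.out.println(assignAll(points, centroids));
    }
}
```

Recomputing the centroids from these assignments, by contrast, does need records grouped by cluster, which is the kind of step a reducer would normally handle.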