Posted to dev@flink.apache.org by santosh_rajaguru <sa...@gmail.com> on 2015/05/21 16:42:23 UTC

difference between reducefunction and GroupReduceFunction

I am new to Flink and MapReduce. My question is:
Apart from incrementally combining two elements, what are the merits of using
ReduceFunction over GroupReduceFunction? Which use cases suit which function
best?

--
View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/difference-between-reducefunction-and-GroupReduceFunction-tp5768.html
Sent from the Apache Flink Mailing List archive at Nabble.com.

Re: difference between reducefunction and GroupReduceFunction

Posted by Stephan Ewen <se...@apache.org>.
Performance-wise, a "GroupReduceFunction" with a combiner should right now be
slightly faster than the ReduceFunction, but not by much.

Long term, the ReduceFunction may become faster, because it will use hash
aggregation under the hood.
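
To illustrate why a pairwise reduce lends itself to hash aggregation, here is
a minimal sketch in plain Java. It is not Flink's runtime code; the class and
method names (HashAggregationSketch, hashAggregate) are made up for this
example. The point is only that a function combining two values into one lets
the engine keep a single running value per key in a hash table, with no sort
and no buffered group.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Hypothetical illustration, not part of the Flink API.
public class HashAggregationSketch {

    // One pass over the records: the table holds exactly one running value
    // per key, updated by applying the pairwise reducer to each new record.
    static <K, V> Map<K, V> hashAggregate(List<Map.Entry<K, V>> records,
                                          BinaryOperator<V> reducer) {
        Map<K, V> table = new HashMap<>();
        for (Map.Entry<K, V> record : records) {
            table.merge(record.getKey(), record.getValue(), reducer);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = new ArrayList<>();
        records.add(new AbstractMap.SimpleEntry<>("a", 1));
        records.add(new AbstractMap.SimpleEntry<>("b", 2));
        records.add(new AbstractMap.SimpleEntry<>("a", 3));
        records.add(new AbstractMap.SimpleEntry<>("b", 4));
        records.add(new AbstractMap.SimpleEntry<>("a", 5));

        Map<String, Integer> sums = hashAggregate(records, Integer::sum);
        // TreeMap only to get a stable print order.
        System.out.println(new TreeMap<>(sums)); // prints: {a=9, b=6}
    }
}
```

A function that needs to iterate over the whole group cannot be driven this
way, because the table would have to hold every element of the group rather
than one running value.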


On Fri, May 22, 2015 at 11:58 AM, santosh_rajaguru <sa...@gmail.com>
wrote:

> Thanks, Maximilian.
>
> My use case is similar to the example given in the graph analysis.
> In graph analysis, the reduce function used is a normal reduce function.
> I ran it in both scenarios and your explanation is right: the normal
> reduce function has a combiner before sorting, unlike the GroupReduce
> function.
> My question: how does this affect performance, given that the result is
> the same in both cases?

Re: difference between reducefunction and GroupReduceFunction

Posted by santosh_rajaguru <sa...@gmail.com>.
Thanks, Maximilian.

My use case is similar to the example given in the graph analysis.
In graph analysis, the reduce function used is a normal reduce function.
I ran it in both scenarios and your explanation is right: the normal
reduce function has a combiner before sorting, unlike the GroupReduce
function.
My question: how does this affect performance, given that the result is
the same in both cases?

Thanks and Regards,
Santosh

--
View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/difference-between-reducefunction-and-GroupReduceFunction-tp5768p5785.html

Re: difference between reducefunction and GroupReduceFunction

Posted by Maximilian Michels <mx...@apache.org>.
Pardon, what I said was not completely right. Both functions consume their
input incrementally. This is obvious for the reduce function, but it is also
true for the GroupReduce, because it receives the values as an Iterable
which, under the hood, can be constructed incrementally as well.

One other difference is that the traditional reduce always applies a
combiner before shuffling the results. The GroupReduceFunction, on the
other hand, does not do that unless you explicitly specify a combiner using
the RichGroupReduceFunction or perform a GroupCombine operation before the
GroupReduce.
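
The effect of a combiner can be sketched in plain Java as a word count over
two input partitions. This is a simulation under stated assumptions, not
Flink code: CombinerSketch and its methods are hypothetical names, and
"shuffled records" here simply means the number of records that would have to
leave their partition.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not part of the Flink API.
public class CombinerSketch {

    // Combiner: pre-aggregates one partition locally, leaving one
    // (word, partialCount) record per distinct word in that partition.
    static Map<String, Integer> combine(List<String> partition) {
        Map<String, Integer> partial = new HashMap<>();
        for (String word : partition) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    // Without a combiner, every input record is shuffled.
    static int shuffledWithoutCombiner(List<List<String>> partitions) {
        int records = 0;
        for (List<String> partition : partitions) {
            records += partition.size();
        }
        return records;
    }

    // With a combiner, only the partial results are shuffled.
    static int shuffledWithCombiner(List<List<String>> partitions) {
        int records = 0;
        for (List<String> partition : partitions) {
            records += combine(partition).size();
        }
        return records;
    }

    // Final reduce after the shuffle: merges the partial counts per word.
    static Map<String, Integer> finalCounts(List<List<String>> partitions) {
        Map<String, Integer> total = new TreeMap<>();
        for (List<String> partition : partitions) {
            combine(partition).forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "a", "b"),
                Arrays.asList("a", "b", "b", "b"));

        System.out.println(shuffledWithoutCombiner(partitions)); // prints: 7
        System.out.println(shuffledWithCombiner(partitions));    // prints: 4
        System.out.println(finalCounts(partitions));             // prints: {a=3, b=4}
    }
}
```

The final counts are the same either way; what changes is how much data has
to be moved before the final reduce.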

Best regards,
Max


On Fri, May 22, 2015 at 10:03 AM, Maximilian Michels <mx...@apache.org> wrote:

> Like you said, it depends on the use case. The GroupReduceFunction is a
> generalization of the traditional reduce. Thus, it is more powerful.
> However, it is also executed differently; a GroupReduceFunction requires
> the whole group to be materialized and passed at once. If your program
> doesn't require that, use the normal reduce function.

Re: difference between reducefunction and GroupReduceFunction

Posted by Maximilian Michels <mx...@apache.org>.
Like you said, it depends on the use case. The GroupReduceFunction is a
generalization of the traditional reduce. Thus, it is more powerful.
However, it is also executed differently; a GroupReduceFunction requires
the whole group to be materialized and passed at once. If your program
doesn't require that, use the normal reduce function.
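
As a rough illustration of the difference in shape, here is a plain-Java
sketch. The helpers pairwiseReduce and groupReduce are simplified stand-ins
invented for this example, not the Flink interfaces. A sum fits the pairwise
form; a median does not, because it needs to see the whole group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical illustration, not part of the Flink API.
public class ReduceVsGroupReduce {

    // Pairwise reduce: folds the group two elements at a time, so only the
    // running result has to be kept. Returns null for an empty group.
    static <T> T pairwiseReduce(Iterable<T> group, BinaryOperator<T> reducer) {
        T acc = null;
        boolean first = true;
        for (T value : group) {
            acc = first ? value : reducer.apply(acc, value);
            first = false;
        }
        return acc;
    }

    // Group reduce: the user function is handed the whole group at once.
    static <T, R> R groupReduce(Iterable<T> group, Function<Iterable<T>, R> reducer) {
        return reducer.apply(group);
    }

    // Needs every element of the group, so it cannot be written as a
    // pairwise combination of two values. Assumes a non-empty group.
    static double median(Iterable<Integer> group) {
        List<Integer> sorted = new ArrayList<>();
        for (int value : group) {
            sorted.add(value);
        }
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    public static void main(String[] args) {
        List<Integer> group = Arrays.asList(5, 1, 4, 2, 3);

        // A sum works in both forms, since group reduce is the more general one.
        Function<Iterable<Integer>, Integer> sumAll = values -> {
            int s = 0;
            for (int v : values) {
                s += v;
            }
            return s;
        };
        Function<Iterable<Integer>, Double> medianOf = ReduceVsGroupReduce::median;

        int sum = pairwiseReduce(group, Integer::sum);
        int sumViaGroup = groupReduce(group, sumAll);
        double med = groupReduce(group, medianOf);

        System.out.println(sum + " " + sumViaGroup + " " + med); // prints: 15 15 3.0
    }
}
```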
