Posted to dev@pig.apache.org by Jie Li <ji...@cs.duke.edu> on 2012/07/12 04:43:21 UTC

Hash aggregation experience

Hi all,

Has anyone tried the hash aggregation feature in Pig 0.10 and seen any
performance improvement? I have recently been benchmarking HashAgg against
the combiner to see whether we should use HashAgg more aggressively, given
that it has lower overhead than the combiner and is more flexible: it can
auto-disable itself, while the combiner can't.

Some of my benchmark results can be found at
https://cwiki.apache.org/confluence/display/PIG/Pig+Performance+Optimization#PigPerformanceOptimization-HashAggvs.Combiner.
Any comments are appreciated!

Jie