Posted to common-user@hadoop.apache.org by Jim Twensky <ji...@gmail.com> on 2010/01/26 08:27:12 UTC
Question on GroupingComparatorClass
Hi,
I'm using a custom grouping comparator class to simulate a secondary
sort on values, and I set it via Job.setGroupingComparatorClass (using
Hadoop 0.20.x) inside my driver. I'm wondering if this class is also
used when grouping the records in the combiner.
Using a combiner greatly improves the performance in my case, but for
the combiners, I want to use the default comparator, not the custom
one that I use before the actual reduce.
Is there a way to just set the custom grouping comparator for the
reduce and bypass it during the combine stage?
Thanks,
Jim
Re: Question on GroupingComparatorClass
Posted by Amogh Vasekar <am...@yahoo-inc.com>.
Hi,
I think the combiner gets only the key sort comparator, not the grouping comparator. So I believe the combiner groups keys with the sort comparator, while the custom grouping comparator applies only on the reducer.
Here's a relevant snippet of code:
{
  super(inputCounter, conf, reporter);
  combinerClass = cls;
  keyClass = (Class<K>) job.getMapOutputKeyClass();
  valueClass = (Class<V>) job.getMapOutputValueClass();
  // The combiner is handed the key sort comparator, not the grouping comparator.
  comparator = (RawComparator<K>) job.getOutputKeyComparator();
}
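To make the difference concrete, here is a rough sketch in plain Java with no Hadoop dependency. It only imitates how adjacent sorted keys would be grouped under each comparator; it is not Hadoop's actual code. The names GroupingDemo, Key, SORT, GROUP, group and sample are all made up for illustration, and the composite key layout (a natural key plus a secondary field) is an assumption about a typical secondary-sort setup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingDemo {
    // Hypothetical composite key: natural key plus the field used for the secondary sort.
    static final class Key {
        final String natural;
        final int secondary;

        Key(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Stand-in for the key sort comparator: natural key first, then the secondary field.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.natural).thenComparingInt(k -> k.secondary);

    // Stand-in for a custom grouping comparator: natural key only.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.natural);

    // Walk the sorted keys and start a new group whenever the comparator
    // reports that two adjacent keys differ.
    static List<List<Key>> group(List<Key> sorted, Comparator<Key> cmp) {
        List<List<Key>> groups = new ArrayList<>();
        Key prev = null;
        for (Key k : sorted) {
            if (prev == null || cmp.compare(prev, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            prev = k;
        }
        return groups;
    }

    // Four sample keys, ordered with the sort comparator: a/1, a/3, b/1, b/2.
    static List<Key> sample() {
        List<Key> keys = new ArrayList<>(Arrays.asList(
            new Key("b", 2), new Key("a", 3), new Key("a", 1), new Key("b", 1)));
        keys.sort(SORT);
        return keys;
    }

    public static void main(String[] args) {
        List<Key> keys = sample();
        // Grouping with the sort comparator (what the combiner appears to do): 4 groups.
        System.out.println("sort comparator groups: " + group(keys, SORT).size());
        // Grouping with the natural-key comparator (reduce side): 2 groups.
        System.out.println("grouping comparator groups: " + group(keys, GROUP).size());
    }
}
```

If the combiner really does group with the sort comparator, each distinct composite key forms its own group there, so a combiner would merge only records whose full keys are equal.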
Amogh