Posted to common-user@hadoop.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2008/03/20 23:56:47 UTC

Default Combiner or default combining behaviour?

Hi,

The MapReduce tutorial mentions Combiners only in passing.  Is there a default Combiner or default combining behaviour?

Concretely, I want to make sure that records are not getting combined behind the scenes in some way without me seeing it, and causing me to lose data.  For instance, if there is a default Combiner or default combining behaviour that collapses multiple records with identical keys and values into a single record, I'd like to avoid that.  Instead of blindly collapsing identical records, I'd want to aggregate their values and emit the aggregate.
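[The distinction drawn above can be sketched in plain Java. This is a hypothetical stand-alone illustration, not Hadoop API code; the class and method names are invented for the example. It contrasts sum-aggregation, which preserves counts for duplicate keys, with naive collapsing, which would silently drop them.]

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration (not Hadoop API): aggregating values for
// identical keys versus collapsing duplicate records outright.
public class CombineSemantics {

    // Sum-aggregation: records with the same key contribute their
    // combined value, so no data is lost.
    static Map<String, Integer> aggregate(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> r : records) {
            out.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = List.of(
            Map.entry("a", 1), Map.entry("a", 1), Map.entry("b", 1));
        // Aggregating yields a=2, b=1; blindly collapsing the two
        // identical ("a", 1) records would instead leave a=1.
        System.out.println(aggregate(records)); // prints {a=2, b=1}
    }
}
```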

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



Re: Default Combiner or default combining behaviour?

Posted by Arun C Murthy <ar...@yahoo-inc.com>.
On Mar 20, 2008, at 3:56 PM, Otis Gospodnetic wrote:

> Hi,
>
> The MapReduce tutorial mentions Combiners only in passing.  Is  
> there a default Combiner or default combining behaviour?
>

No, there is *no* default combiner at all. It has to be explicitly  
set in the JobConf to take effect.

Arun
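[For context, explicitly setting a combiner looks like the job setup from the MapReduce tutorial of that era. This is a configuration sketch only, assuming the old org.apache.hadoop.mapred API and the stock WordCount Map/Reduce classes; combining happens solely because setCombinerClass is called.]

```java
// Sketch of the old-API (org.apache.hadoop.mapred) job setup, assuming the
// tutorial's WordCount classes. Omit the setCombinerClass line and no
// combiner runs at all -- there is no default.
JobConf conf = new JobConf(WordCount.class);
conf.setJobName("wordcount");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(Map.class);
conf.setCombinerClass(Reduce.class); // explicit opt-in to combining
conf.setReducerClass(Reduce.class);
```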
