Posted to common-dev@hadoop.apache.org by Mike Smith <mi...@gmail.com> on 2007/09/21 00:52:35 UTC

Last key in map or reduce

Hi,

To get the last key, is it OK to keep a reference to the OutputCollector and
Reporter in the reducer class and use them in close()? Or are the output
parts already closed by that time? Is there any trick for finding the
last key in a mapper or reducer?

Thanks,
Mike

Re: Last key in map or reduce

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Sep 20, 2007, at 3:52 PM, Mike Smith wrote:

> To get the last key, is it OK to keep a reference to the OutputCollector and
> Reporter in the reducer class and use them in close()? Or are the output
> parts already closed by that time? Is there any trick for finding the
> last key in a mapper or reducer?

It is fine to use the collector until the close method returns. The  
close method is intended for that purpose.
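
A minimal sketch of that pattern, in Java. To stay self-contained it uses
hypothetical stand-in types (Collector, ListCollector, LastKeyReducer) that
only mirror the shape of Hadoop's OutputCollector.collect(key, value) and the
reducer's reduce/close lifecycle; they are not the real Hadoop classes, and
the "LAST_KEY:" marker record is made up for illustration. The reducer
remembers the collector and the most recent key on each reduce call, then
emits the last key from close().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring the shape of Hadoop's OutputCollector.
interface Collector<K, V> {
    void collect(K key, V value);
}

// Hypothetical collector that records output as "key=value" strings.
class ListCollector implements Collector<String, Integer> {
    final List<String> records = new ArrayList<String>();

    @Override
    public void collect(String key, Integer value) {
        records.add(key + "=" + value);
    }
}

// Sketch of a reducer that keeps the collector and emits the last key in close().
class LastKeyReducer {
    private Collector<String, Integer> savedOutput; // kept for use in close()
    private String lastKey;                         // most recent key seen

    // Sums the values for one key and remembers the key and the collector.
    void reduce(String key, Iterator<Integer> values,
                Collector<String, Integer> output) {
        savedOutput = output;
        lastKey = key;
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.collect(key, sum);
    }

    // Called once after the final reduce(); emits a marker record for the last key.
    void close() {
        if (savedOutput != null && lastKey != null) {
            savedOutput.collect("LAST_KEY:" + lastKey, 0);
        }
    }
}
```

Because keys reach a reducer in sorted order, the key seen by the final
reduce call is the last key of that reducer's partition, not of the whole job.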

In fact, because we need to support streaming and pipes, the
constraints on the collector are very loose. You can emit key/value
pairs to the collector even between calls to map or reduce (i.e., if
your mapper launches a thread, it can output records even when the
Mapper's map method is not being called).
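
A rough sketch of the threaded case, in Java, again with hypothetical
stand-ins (RecordSink, SynchronizedSink, AsyncMapper) rather than the real
Hadoop types. Here map() only queues the record; a background thread emits it
to the collector later, possibly while map() is not running, and close()
waits for that thread to drain the queue. The sink is synchronized as an
assumption of this sketch; the thread does not say whether the real collector
is thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for an output collector.
interface RecordSink {
    void collect(String key, String value);
}

// Hypothetical thread-safe sink that records output as "key=value" strings.
class SynchronizedSink implements RecordSink {
    final List<String> records =
        Collections.synchronizedList(new ArrayList<String>());

    @Override
    public void collect(String key, String value) {
        records.add(key + "=" + value);
    }
}

// Sketch of a mapper whose output is emitted by a background thread.
class AsyncMapper {
    private static final String[] STOP = new String[0]; // end-of-input sentinel
    private final BlockingQueue<String[]> pending =
        new LinkedBlockingQueue<String[]>();
    private Thread worker;

    // Queues the record; the worker emits it outside of this call.
    void map(String key, String value, final RecordSink output) {
        if (worker == null) {
            worker = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        for (String[] kv = pending.take(); kv != STOP;
                                kv = pending.take()) {
                            output.collect(kv[0], kv[1].toUpperCase());
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            worker.start();
        }
        pending.add(new String[] {key, value}); // unbounded queue, never blocks
    }

    // Signals the worker to stop and waits until all queued records are emitted.
    void close() {
        if (worker == null) {
            return;
        }
        pending.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The join in close() matters: the collector is only usable until close
returns, so any background output has to be finished before then.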

-- Owen