Posted to common-user@hadoop.apache.org by Sandhya E <sa...@gmail.com> on 2009/01/08 13:13:30 UTC

OutputCollector to print only keys

Hi

Does Hadoop 0.18 have an OutputCollector that can print out only the
keys and suppress the values? In our case we need only the keys, so
we call output.collect(key, blank) and then reprocess the entire
output file to remove the trailing tabs.

Can this be done more cleanly through the Hadoop classes themselves?

Many Thanks in Advance
Sandhya

Re: OutputCollector to print only keys

Posted by Owen O'Malley <om...@apache.org>.
On Jan 8, 2009, at 4:13 AM, Sandhya E wrote:

> Does Hadoop 0.18 have an OutputCollector that can print out only the
> keys and suppress the values? In our case we need only the keys, so
> we call output.collect(key, blank) and then reprocess the entire
> output file to remove the trailing tabs.

If you are using TextOutputFormat, you can emit either a NullWritable
or null as the value. So do:

output.collect(key, null);

and it should work fine.

-- Owen
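
To illustrate why a null value avoids the trailing tab while an empty
string does not, here is a minimal plain-Java sketch of the line-writing
rule described above. It does not use or reproduce the Hadoop classes:
KeyOnlyLineSketch, NULL_PLACEHOLDER (a stand-in for NullWritable), and
formatLine are hypothetical names, and the tab separator and newline are
assumed defaults.

```java
public class KeyOnlyLineSketch {
    // Hypothetical stand-in for Hadoop's NullWritable singleton.
    static final Object NULL_PLACEHOLDER = new Object();

    // A field counts as absent if it is null or the placeholder.
    static boolean isAbsent(Object o) {
        return o == null || o == NULL_PLACEHOLDER;
    }

    // Builds the text line for one record. The separator is written
    // only when both key and value are present, so an absent value
    // leaves just the key followed by a newline.
    static String formatLine(Object key, Object value) {
        boolean noKey = isAbsent(key);
        boolean noValue = isAbsent(value);
        if (noKey && noValue) {
            return ""; // nothing to write for this record
        }
        StringBuilder line = new StringBuilder();
        if (!noKey) {
            line.append(key);
        }
        if (!noKey && !noValue) {
            line.append('\t'); // assumed default key/value separator
        }
        if (!noValue) {
            line.append(value);
        }
        return line.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(formatLine("apple", "1"));     // key, tab, value
        System.out.print(formatLine("banana", null));   // key only
        System.out.print(formatLine("cherry", ""));     // key plus trailing tab
    }
}
```

The third call shows the original problem: an empty string is still a
present value, so the separator is written and the line ends in a tab.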
