Posted to common-dev@hadoop.apache.org by Gopal Gandhi <go...@yahoo.com> on 2008/07/21 23:35:45 UTC
[Streaming] I figured out a way to do combining in the mapper; would anybody check it?
I am using Hadoop Streaming.
I figured out a way to do the combining inside the mapper. Is it the same as using a separate combiner?
For example: the input is a list of words, one per line, and I want to count the number of occurrences of each word.
The traditional mapper is:
while (<STDIN>) {
    chomp;
    my $word = $_;
    # Emit each word with a count of 1; the output string must be quoted.
    print "$word\t1\n";
}
........
Instead of using an additional combiner, I modified the mapper to use a hash:
my %hash = ();
while (<STDIN>) {
    chomp;
    my $word = $_;
    $hash{$word}++;
}
# Iterate over the keys only; looping over %hash itself would also yield the values.
foreach my $key (keys %hash) {
    print "$key\t$hash{$key}\n";
}
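One practical difference from a separate combiner is that the hash above holds every distinct word in the mapper's memory until the input is exhausted. A minimal sketch of a variant that writes out partial counts whenever the hash reaches a size limit is shown here, assuming the reducer sums the counts it receives. The names map_counts and flush_counts and the limit of 2 are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: print the partial counts collected so far and empty the hash.
sub flush_counts {
    my ($counts, $out) = @_;
    print {$out} "$_\t$counts->{$_}\n" for sort keys %$counts;
    %$counts = ();
}

# Hypothetical helper: count words read from $in, flushing to $out whenever
# the number of distinct words in memory reaches $max_keys.
sub map_counts {
    my ($in, $out, $max_keys) = @_;
    my %counts;
    while (my $word = <$in>) {
        chomp $word;
        next if $word eq '';    # skip blank lines
        $counts{$word}++;
        flush_counts(\%counts, $out) if scalar(keys %counts) >= $max_keys;
    }
    flush_counts(\%counts, $out);    # emit whatever is left at end of input
}

# Demo on an in-memory input with an artificially small limit.
my $demo = "a\na\nb\nc\na\n";
open my $demo_in, '<', \$demo or die "cannot open in-memory input: $!";
map_counts($demo_in, \*STDOUT, 2);
close $demo_in;
```

In a streaming job the call would presumably be map_counts(\*STDIN, \*STDOUT, $limit) with a much larger limit. The same word can now appear on several output lines, which is harmless as long as the reducer adds the counts together.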
Is it the same as using a separate combiner?
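Either mapper needs a reducer that adds up the count field rather than counting input lines, because the hash-based mapper emits partial totals instead of 1s. A minimal sketch follows, assuming tab-separated "word, count" lines and relying on Hadoop Streaming delivering reducer input sorted by key. The helper name reduce_counts is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read "word\tcount" lines grouped by word from $in
# and write one "word\ttotal" line per word to $out.
sub reduce_counts {
    my ($in, $out) = @_;
    my $current;
    my $total = 0;
    while (my $line = <$in>) {
        chomp $line;
        my ($word, $count) = split /\t/, $line, 2;
        next unless defined $count;    # skip malformed lines
        if (defined $current && $word ne $current) {
            # The key changed, so the previous word is complete.
            print {$out} "$current\t$total\n";
            $total = 0;
        }
        $current = $word;
        $total += $count;
    }
    print {$out} "$current\t$total\n" if defined $current;
}

# Demo on an in-memory input that mixes "1" counts and partial totals.
my $sample = "apple\t1\napple\t3\npear\t2\n";
open my $sample_in, '<', \$sample or die "cannot open in-memory input: $!";
reduce_counts($sample_in, \*STDOUT);    # prints "apple\t4" then "pear\t2"
close $sample_in;
```

In a streaming job the call would presumably be reduce_counts(\*STDIN, \*STDOUT). Because it sums the values, the final totals come out the same whether the mapper emits a 1 per word or pre-aggregated counts.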