Posted to common-user@hadoop.apache.org by Shirley Cohen <sc...@cs.utexas.edu> on 2008/09/09 21:20:21 UTC

output multiple values?

I have a simple reducer that computes the average by doing a sum/count.
But I want to output both the average and the count for a
given key, not just the average. Is it possible to output both values  
from the same invocation of the reducer? Or do I need two reducer  
invocations? If I try to call output.collect() twice from the reducer  
and label the key with "type=avg" or "type=count", I get a bunch of  
garbage out. Please let me know if you have any suggestions.

Thanks,

Shirley

Re: output multiple values?

Posted by Shirley Cohen <sh...@cis.upenn.edu>.
Thanks Owen! I found the bug in my code: Doing collect twice does  
work now :))
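
A minimal sketch of the collect-twice approach, for anyone reading the archive later. It is an illustration under assumptions, not the code from this thread: `OutputCollector` below is a local stand-in that only mirrors the shape of Hadoop's old-API interface so the example compiles without Hadoop, and `AvgCountReducer`, the plain String/Long types, and the exact key labels are hypothetical.

```java
import java.io.IOException;
import java.util.Iterator;

// Stand-in with the same shape as org.apache.hadoop.mapred.OutputCollector
// (assumption: swap in the real interface when building against Hadoop).
interface OutputCollector<K, V> {
    void collect(K key, V value) throws IOException;
}

// Hypothetical reducer: same idea as Reducer.reduce in the old mapred API,
// minus the Reporter argument and using plain Java types instead of Writables.
class AvgCountReducer {
    void reduce(String key, Iterator<Long> values,
                OutputCollector<String, String> output) throws IOException {
        long sum = 0;
        long count = 0;
        while (values.hasNext()) {
            sum += values.next();
            count++;
        }
        // Two records from one invocation, distinguished by a label on the key.
        output.collect(key + " type=avg", Double.toString(sum / (double) count));
        output.collect(key + " type=count", Long.toString(count));
    }
}
```

Each call to reduce() then produces two output records for the key, one per label.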

Shirley

On Sep 9, 2008, at 4:19 PM, Owen O'Malley wrote:

>
> On Sep 9, 2008, at 12:20 PM, Shirley Cohen wrote:
>
>> I have a simple reducer that computes the average by doing a sum/count.
>> But I want to output both the average and the count for a
>> given key, not just the average. Is it possible to output both  
>> values from the same invocation of the reducer? Or do I need two  
>> reducer invocations? If I try to call output.collect() twice from  
>> the reducer and label the key with "type=avg" or "type=count", I  
>> get a bunch of garbage out. Please let me know if you have any  
>> suggestions.
>
> I'd be tempted to define a type like:
>
> class AverageAndCount implements Writable {
>   private long sum;
>   private long count;
>   ...
>   public String toString() {
>      return "avg = " + (sum / (double) count) + ", count = " + count;
>   }
> }
>
> Then you could use your reducer as both a combiner and reducer and  
> you would get both values out if you use TextOutputFormat. That  
> said, it should absolutely work to do collect twice.
>
> -- Owen


Re: output multiple values?

Posted by Owen O'Malley <om...@apache.org>.
On Sep 9, 2008, at 12:20 PM, Shirley Cohen wrote:

> I have a simple reducer that computes the average by doing a sum/count.
> But I want to output both the average and the count for a
> given key, not just the average. Is it possible to output both  
> values from the same invocation of the reducer? Or do I need two  
> reducer invocations? If I try to call output.collect() twice from  
> the reducer and label the key with "type=avg" or "type=count", I get  
> a bunch of garbage out. Please let me know if you have any  
> suggestions.

I'd be tempted to define a type like:

class AverageAndCount implements Writable {
   private long sum;
   private long count;
   ...
   public String toString() {
      return "avg = " + (sum / (double) count) + ", count = " + count;
   }
}

Then you could use your reducer as both a combiner and reducer and you  
would get both values out if you use TextOutputFormat. That said, it  
should absolutely work to do collect twice.
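
One way the elided parts of such a type might be filled in is sketched below. This is a guess at a workable shape, not the definitive version: the `Writable` interface here is a local stand-in with the same two methods as Hadoop's `org.apache.hadoop.io.Writable` so the example compiles on its own, and the constructors and the `add` method are hypothetical additions for illustration.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Stand-in for org.apache.hadoop.io.Writable (assumption: replace with the
// real import when building against Hadoop).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

class AverageAndCount implements Writable {
    private long sum;
    private long count;

    // No-argument constructor so the framework can create instances
    // before calling readFields().
    AverageAndCount() {}

    // Hypothetical convenience constructor; a mapper would emit (value, 1).
    AverageAndCount(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // Hypothetical helper: merging partial sums and counts is what lets the
    // same logic run as both combiner and reducer.
    void add(AverageAndCount other) {
        sum += other.sum;
        count += other.count;
    }

    public void write(DataOutput out) throws IOException {
        out.writeLong(sum);
        out.writeLong(count);
    }

    public void readFields(DataInput in) throws IOException {
        sum = in.readLong();
        count = in.readLong();
    }

    public String toString() {
        return "avg = " + (sum / (double) count) + ", count = " + count;
    }
}
```

Carrying the sum rather than the average is the design choice that matters: partial sums and counts can be merged in any order, whereas averages of averages would be wrong when groups differ in size.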

-- Owen