Posted to common-user@hadoop.apache.org by Derek Shaw <de...@rogers.com> on 2008/05/07 05:26:30 UTC

Collecting output not to file

Hey,

From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?

-Derek

Re: Collecting output not to file

Posted by Igor Maximchuk <im...@masterhost.ru>.

> (I think that I need to implement my own OutputCollector, but I don't
> know how to tell hadoop to use it.) How can I do this?
> -Derek
>
You probably need to define your own OutputFormat and tell Hadoop to use
it by calling the setOutputFormat method of JobConf.
The OutputFormat instance is used to create a RecordWriter instance, which
the OutputCollector uses to process output data.
You may want to take a look at the implementation of
SequenceFileOutputFormat as an example.
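
A rough sketch of the idea, in case it helps. It uses only the Java standard library, so every class in it (InMemoryOutputSketch, ResultStore, InMemoryRecordWriter, InMemoryOutputFormat) is a hypothetical stand-in and not part of Hadoop. It only mirrors the chain described above: the output format creates a writer, and the writer receives each key/value pair. In a real job you would implement Hadoop's own OutputFormat and RecordWriter interfaces and register the class with JobConf.setOutputFormat. The sketch also assumes that the tasks and the driver share one JVM, as with the local job runner. On a real cluster the reduce tasks run in other processes, so a static map like this would not be visible to the driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Entry point comes first so the file can be launched directly with `java`.
class InMemoryOutputSketch {
    public static void main(String[] args) {
        ResultStore.clear();
        // Stand-in for the framework asking the output format for a writer.
        InMemoryRecordWriter writer = new InMemoryOutputFormat().getRecordWriter();
        // Stand-in for the pairs that reduce() hands to its OutputCollector.
        writer.write("apple", "3");
        writer.write("pear", "1");
        writer.write("apple", "5");
        writer.close();
        // "After the job": inspect the results in memory instead of reading a file.
        System.out.println(ResultStore.snapshot());
    }
}

// Hypothetical holder for the results. It is static because the framework,
// not the driver, creates the output format from the class it was given.
final class ResultStore {
    private static final Map<String, List<String>> RESULTS = new TreeMap<>();

    private ResultStore() {}

    // Appends the value to the list kept for this key (a simple multimap).
    static synchronized void put(String key, String value) {
        RESULTS.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Returns a deep copy so callers cannot modify the stored results.
    static synchronized Map<String, List<String>> snapshot() {
        Map<String, List<String>> copy = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : RESULTS.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return copy;
    }

    static synchronized void clear() {
        RESULTS.clear();
    }
}

// Hypothetical stand-in for a RecordWriter: it receives each key/value pair.
final class InMemoryRecordWriter {
    void write(String key, String value) {
        ResultStore.put(key, value);
    }

    void close() {
        // Nothing to flush; the pairs are already in memory.
    }
}

// Hypothetical stand-in for an OutputFormat: its job is to create the writer.
final class InMemoryOutputFormat {
    InMemoryRecordWriter getRecordWriter() {
        return new InMemoryRecordWriter();
    }
}
```

The static holder is the main design choice here: since the job is only told which output format class to use, a shared static store is one simple way for the driver to reach the collected pairs afterwards.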

Re: Collecting output not to file

Posted by Derek Shaw <de...@rogers.com>.

Good point.

I want to put the results of the reduce function in a multimap instead of writing them to a file.

-Derek

Amar Kamat <am...@yahoo-inc.com> wrote:
Derek Shaw wrote:
> Hey,
>
> From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them
What do you mean by "store and inspect"?
>  and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?
>
> -Derek
>
>

Re: Collecting output not to file

Posted by Amar Kamat <am...@yahoo-inc.com>.

Derek Shaw wrote:
> Hey,
>
> From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them
What do you mean by "store and inspect"?
>  and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?
>
> -Derek
>
>