Posted to common-user@hadoop.apache.org by Derek Shaw <de...@rogers.com> on 2008/05/07 05:26:30 UTC
Collecting output not to file
Hey,
From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?
-Derek
Re: Collecting output not to file
Posted by Igor Maximchuk <im...@masterhost.ru>.
> (I think that I need to implement my own OutputCollector, but I don't
> know how to tell hadoop to use it.) How can I do this?
> -Derek
>
You probably need to define your own OutputFormat and tell Hadoop to use
it by calling the setOutputFormat method of JobConf.
The OutputFormat instance is used to create a RecordWriter instance, which
the OutputCollector uses to process the output data.
You may want to take a look at the implementation of
SequenceFileOutputFormat as an example.
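
As a rough illustration of the idea, here is a minimal sketch of the piece that would sit where a RecordWriter normally does, gathering key/value pairs in memory instead of writing them to a file. The class names InMemoryMultimapWriter and CollectDemo are hypothetical, and the class deliberately does not implement Hadoop's RecordWriter interface so that it compiles without Hadoop on the classpath. In a real job it would presumably implement org.apache.hadoop.mapred.RecordWriter and be returned from a custom OutputFormat's getRecordWriter method. Note also the assumption that everything runs in one JVM (for example with the local job runner): on a cluster, reduce tasks run in separate processes, so the client that submitted the job would not see this map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a custom RecordWriter: it has the same
// write/close shape but keeps the output in memory instead of a file.
// It is NOT a Hadoop class and does not implement any Hadoop interface.
class InMemoryMultimapWriter<K, V> {
    // A simple multimap: each key maps to the list of values written for it.
    private final Map<K, List<V>> results = new LinkedHashMap<>();

    // Called once per output record; appends the value under its key.
    public synchronized void write(K key, V value) {
        results.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Nothing to flush or release, since no file is involved.
    public synchronized void close() {
    }

    // Read-only view of the map for inspection after the job has finished.
    // The value lists themselves are not copied.
    public synchronized Map<K, List<V>> results() {
        return Collections.unmodifiableMap(results);
    }
}

class CollectDemo {
    public static void main(String[] args) {
        InMemoryMultimapWriter<String, Integer> writer = new InMemoryMultimapWriter<>();
        // Simulate the records a reduce function might emit.
        writer.write("a", 1);
        writer.write("b", 3);
        writer.write("a", 2);
        writer.close();
        System.out.println(writer.results()); // prints {a=[1, 2], b=[3]}
    }
}
```

If the results need to survive beyond a single JVM, writing them to a file (or some other shared store) and reading them back after the job completes is likely the safer route.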
Re: Collecting output not to file
Posted by Derek Shaw <de...@rogers.com>.
Good point.
I want to put the results of the reduce function in a multimap instead of writing them to a file.
-Derek
Amar Kamat <am...@yahoo-inc.com> wrote:
Derek Shaw wrote:
> Hey,
>
> From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them
What do you mean by "store and inspect"?
> and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?
>
> -Derek
>
>
Re: Collecting output not to file
Posted by Amar Kamat <am...@yahoo-inc.com>.
Derek Shaw wrote:
> Hey,
>
> From the examples that I have seen thus far, all of the results from the reduce function are being written to a file. Instead of writing results to a file, I want to store them
What do you mean by "store and inspect"?
> and inspect them after the job is completed. (I think that I need to implement my own OutputCollector, but I don't know how to tell hadoop to use it.) How can I do this?
>
> -Derek
>
>