Posted to mapreduce-user@hadoop.apache.org by Torsten Curdt <tc...@apache.org> on 2010/06/07 16:38:03 UTC

multiple outputs

I need to emit to different output files from a reducer.

The old API had MultipleSequenceFileOutputFormat.
Am I missing something or is this gone in the new API?

Are there any problems porting this over?
Or does it just need to be done?

cheers
--
Torsten

Re: multiple outputs

Posted by Amareshwari Sri Ramadasu <am...@yahoo-inc.com>.
Yes. They can be used inside a mapper also.
See org.apache.hadoop.mapred.lib.TestMultipleOutputs (old API) or org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs (new API) for some sample code.
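
For readers who just want the shape of it, here is a minimal sketch of map-side use with the new API. The class name TaggingMapper, the named output "errors", and the rule that lines starting with "ERROR" get diverted are all made up for illustration; only the MultipleOutputs calls come from Hadoop. Keep in mind that records written through MultipleOutputs go straight to output files and bypass the shuffle, so they never reach a reducer.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Hypothetical mapper: diverts some lines to a named output, passes the rest on.
// The driver is assumed to have registered the named output beforehand, e.g.:
//   MultipleOutputs.addNamedOutput(job, "errors",
//       TextOutputFormat.class, Text.class, NullWritable.class);
class TaggingMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private MultipleOutputs<Text, NullWritable> outputs;

    // Illustrative routing rule, not part of Hadoop.
    static boolean isError(String line) {
        return line.startsWith("ERROR");
    }

    @Override
    protected void setup(Context context) {
        outputs = new MultipleOutputs<Text, NullWritable>(context);
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        if (isError(line.toString())) {
            // Written directly to the "errors" files; skips the shuffle.
            outputs.write("errors", line, NullWritable.get());
        } else {
            // Normal map output; goes on to the reducers as usual.
            context.write(line, NullWritable.get());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Closing flushes the record writers behind the named outputs.
        outputs.close();
    }
}
```

This reads the input once: the diverted lines typically end up in files named after the named output (errors-m-00000 and so on) in the job output directory, while everything else flows through the regular map output.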

Thanks
Amareshwari


On 6/9/10 5:57 AM, "Torsten Curdt" <tc...@vafer.org> wrote:

Can the MultipleOutputs also be used inside a mapper?

So basically I pipe data into different reducers from the mapper.

Of course I could do two separate jobs but that would be very inefficient
as I would have to go/read through all the data twice.

cheers
--
Torsten



Re: multiple outputs

Posted by Torsten Curdt <tc...@vafer.org>.
Can the MultipleOutputs also be used inside a mapper?

So basically I pipe data into different reducers from the mapper.

Of course I could do two separate jobs but that would be very inefficient
as I would have to go/read through all the data twice.

cheers
--
Torsten

On Tue, Jun 8, 2010 at 06:22, Amareshwari Sri Ramadasu
<am...@yahoo-inc.com> wrote:
> MultipleOutputs has been ported to the new API in
> http://issues.apache.org/jira/browse/MAPREDUCE-370
> See the discussion on the JIRA issue, and the javadoc and test case, for an
> example of how to use it.
>
> Thanks
> Amareshwari
>

Re: multiple outputs

Posted by Amareshwari Sri Ramadasu <am...@yahoo-inc.com>.
MultipleOutputs has been ported to the new API in http://issues.apache.org/jira/browse/MAPREDUCE-370
See the discussion on the JIRA issue, and the javadoc and test case, for an example of how to use it.
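
As a rough sketch of what reducer-side use looks like with the new API (not taken from the JIRA patch or its test case): the class name SplitBySizeReducer, the named outputs "small" and "large", and the threshold of 10 are invented for illustration. The Hadoop calls used are MultipleOutputs.addNamedOutput, the MultipleOutputs constructor, write and close.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical reducer: sums the counts per key and writes each total to one
// of two sets of sequence files depending on its size.
class SplitBySizeReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    static final int THRESHOLD = 10; // illustrative value

    private MultipleOutputs<Text, IntWritable> outputs;

    // Illustrative routing rule; named outputs must be alphanumeric.
    static String outputNameFor(int total) {
        return total < THRESHOLD ? "small" : "large";
    }

    // Call from the driver before submitting the job.
    static void configure(Job job) {
        job.setReducerClass(SplitBySizeReducer.class);
        MultipleOutputs.addNamedOutput(job, "small",
                SequenceFileOutputFormat.class, Text.class, IntWritable.class);
        MultipleOutputs.addNamedOutput(job, "large",
                SequenceFileOutputFormat.class, Text.class, IntWritable.class);
    }

    @Override
    protected void setup(Context context) {
        outputs = new MultipleOutputs<Text, IntWritable>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int total = 0;
        for (IntWritable value : values) {
            total += value.get();
        }
        outputs.write(outputNameFor(total), key, new IntWritable(total));
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Without this the record writers may never be flushed.
        outputs.close();
    }
}
```

Since this reducer never calls context.write, the default part-r-* files would be expected to come out empty, with the data in files prefixed by the named output instead.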

Thanks
Amareshwari

On 6/7/10 8:08 PM, "Torsten Curdt" <tc...@apache.org> wrote:

I need to emit to different output files from a reducer.

The old API had MultipleSequenceFileOutputFormat.
Am I missing something or is this gone in the new API?

Are there any problems porting this over?
Or does it just need to be done?

cheers
--
Torsten