Posted to common-user@hadoop.apache.org by Amandeep Khurana <am...@gmail.com> on 2009/03/27 10:46:07 UTC

Multiple k,v pairs from a single map - possible?

Is it possible to output multiple key value pairs from a single map function
run?

For example, the mapper outputting <name, phone> and <name, address>
simultaneously...

Can I write multiple output.collect(...) commands?

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
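
For reference, a map function may call its collector any number of times per input record, so a single call can emit both pairs. The sketch below shows the idea in plain Java so that it runs without Hadoop on the classpath: `Collector` is a hypothetical stand-in for Hadoop's `OutputCollector`, and the `name,phone,address` record layout is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiEmitSketch {

    /** Hypothetical stand-in for Hadoop's OutputCollector<K, V>. */
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    /**
     * Models one map() call. The input layout "name,phone,address" is an
     * assumption; the point is that collect() is called twice for one record.
     */
    static void map(String record, Collector<String, String> output) {
        String[] fields = record.split(",", 3);
        if (fields.length < 3) {
            return; // skip malformed records
        }
        String name = fields[0].trim();
        output.collect(name, "phone:" + fields[1].trim());
        output.collect(name, "address:" + fields[2].trim());
    }

    /** Runs map() over one record and returns the emitted pairs as "key=value". */
    static List<String> emitted(String record) {
        List<String> pairs = new ArrayList<>();
        map(record, (k, v) -> pairs.add(k + "=" + v));
        return pairs;
    }

    public static void main(String[] args) {
        for (String pair : emitted("alice, 555-0100, 1 Main St")) {
            System.out.println(pair);
        }
    }
}
```

Tagging each value ("phone:", "address:") is one possible way to let the reducer tell the two kinds of value apart, since both arrive under the same key.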

Re: Multipleoutput file

Posted by 皮皮 <pi...@gmail.com>.
I did it another way:

    // Job setup: register the two named outputs.
    MultipleOutputs.addNamedOutput(job, "delete",
            SequenceFileOutputFormat.class, Text.class, IntWritable.class);
    MultipleOutputs.addNamedOutput(job, "compare",
            SequenceFileOutputFormat.class, LongWritable.class, IndexDoc.class);

    // Later: pick the output files in outDir1 by name prefix.
    // If several files share a prefix, only the last one listed is kept.
    Path toDelete = null, toCompare = null;
    FileStatus[] fstats = fs.listStatus(outDir1);
    for (FileStatus file : fstats) {
        String name = file.getPath().getName();
        if (name.startsWith("delete")) {
            toDelete = file.getPath();
        } else if (name.startsWith("compare")) {
            toCompare = file.getPath();
        }
    }

I don't know whether this is the usual way to do it, but it works for me
for now.

2009/5/22 皮皮 <pi...@gmail.com>

> Thank you for your reply, Jason.
>
> Well, what should I do if I want only certain files in the directory,
> not all of them?

Re: Multipleoutput file

Posted by 皮皮 <pi...@gmail.com>.
Thank you for your reply, Jason.

Well, what should I do if I want only certain files in the directory,
not all of them?

2009/5/21 jason hadoop <ja...@gmail.com>

> setInputPaths will take an array or variable arguments.
> Or you can simply provide the directory that the individual files reside
> in,
> and the individual files will be added.
>
> If there are other files in the directory, you may need to specify a custom
> input path filter via FileInputFormat.setInputPathFilter.

Re: Multipleoutput file

Posted by jason hadoop <ja...@gmail.com>.
setInputPaths will take an array or variable arguments.
Or you can simply provide the directory that the individual files reside in,
and the individual files will be added.

If there are other files in the directory, you may need to specify a custom
input path filter via FileInputFormat.setInputPathFilter.
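
A minimal sketch of the rule such a path filter could apply, written as plain Java so that it runs without Hadoop. In a real job the same test would sit inside the `accept(Path)` method of a class implementing `org.apache.hadoop.fs.PathFilter`; the `PrefixFilterSketch` class, the sample file names, and the `compare` prefix are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PrefixFilterSketch {

    private final String prefix;

    PrefixFilterSketch(String prefix) {
        this.prefix = prefix;
    }

    /** Mirrors PathFilter.accept(Path), but works on the file name only. */
    boolean accept(String fileName) {
        return fileName.startsWith(prefix);
    }

    /** Keeps only the names the filter accepts, preserving their order. */
    List<String> select(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (accept(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "compare-r-00000", "delete-r-00000", "part-00000", "compare-r-00001");
        System.out.println(new PrefixFilterSketch("compare").select(listing));
    }
}
```

Since FileInputFormat.setInputPathFilter takes the filter class rather than an instance, a real filter would presumably need a no-argument constructor, with the prefix fixed in the class or read from the job configuration.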


2009/5/21 皮皮 <pi...@gmail.com>

> Yes, but how can I get the commaSeperatedPaths? I can't specify it
> by hand.
>
> It's not practical to do this:
>
> commaSeperatedPaths_1 = "MAPPINGOUTPUT-r-00001";
> commaSeperatedPaths_2 = "MAPPINGOUTPUT-r-00002";
>
> FileInputFormat.setInputPaths(job, commaSeperatedPaths_1);
> FileInputFormat.setInputPaths(job, commaSeperatedPaths_2);



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals

Re: Multipleoutput file

Posted by 皮皮 <pi...@gmail.com>.
Yes, but how can I get the commaSeperatedPaths? I can't specify it
by hand.

It's not practical to do this:

commaSeperatedPaths_1 = "MAPPINGOUTPUT-r-00001";
commaSeperatedPaths_2 = "MAPPINGOUTPUT-r-00002";

FileInputFormat.setInputPaths(job, commaSeperatedPaths_1);
FileInputFormat.setInputPaths(job, commaSeperatedPaths_2);
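
One way to avoid writing each name out is to build the comma-separated string from a list of file names. As far as I know, each `FileInputFormat.setInputPaths` call replaces the paths set by the previous call, so the two calls above would leave only the second file as input. The sketch below is plain Java and does not touch Hadoop; `joinPaths` and the `/out/job1` directory are hypothetical names, and the resulting string is what would be passed as `commaSeperatedPaths`.

```java
import java.util.Arrays;
import java.util.List;

final class InputPathsSketch {

    /** Joins dir + "/" + name for every name, separated by commas. */
    static String joinPaths(String dir, List<String> fileNames) {
        StringBuilder joined = new StringBuilder();
        for (String name : fileNames) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(dir).append('/').append(name);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String commaSeperatedPaths = joinPaths("/out/job1",
                Arrays.asList("MAPPINGOUTPUT-r-00001", "MAPPINGOUTPUT-r-00002"));
        // This single string would go into one setInputPaths(job, ...) call.
        System.out.println(commaSeperatedPaths);
    }
}
```

The list of names could come from a directory listing rather than being typed in, and FileInputFormat.addInputPath is an alternative when adding paths one at a time is more convenient.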



2009/4/7 Brian MacKay <Br...@medecision.com>

>
> Not sure about your question:  seems like you'd like to do this...?
>
> After you run the job, your output files may be named
> MAPPINGOUTPUT-r-00001, MAPPINGOUTPUT-r-00002, etc.
>
> You'd need to set them as multiple inputs.
>
> FileInputFormat.setInputPaths(job, commaSeperatedPaths);
>
>
> Brian

Multipleoutput file

Posted by Brian MacKay <Br...@MEDecision.com>.
Not sure about your question:  seems like you'd like to do this...?

After you run the job, your output files may be named MAPPINGOUTPUT-r-00001, MAPPINGOUTPUT-r-00002, etc.

You'd need to set them as multiple inputs.

FileInputFormat.setInputPaths(job, commaSeperatedPaths);


Brian

-----Original Message-----
From: 皮皮 [mailto:pi.bingfeng@gmail.com] 
Sent: Tuesday, April 07, 2009 3:30 AM
To: core-user@hadoop.apache.org
Subject: Re: Multiple k,v pairs from a single map - possible?

Could anybody tell me how to get one of the MultipleOutputs files in another
job configuration?

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

The information transmitted is intended only for the person or entity to 
which it is addressed and may contain confidential and/or privileged 
material. Any review, retransmission, dissemination or other use of, or 
taking of any action in reliance upon, this information by persons or 
entities other than the intended recipient is prohibited. If you received 
this message in error, please contact the sender and delete the material 
from any computer.


Re: Multiple k,v pairs from a single map - possible?

Posted by 皮皮 <pi...@gmail.com>.
Could anybody tell me how to get one of the MultipleOutputs files in another
job configuration?

2009/4/3 皮皮 <pi...@gmail.com>

> Thank you very much. This is what I was looking for.

Re: Multiple k,v pairs from a single map - possible?

Posted by 皮皮 <pi...@gmail.com>.
Thank you very much. This is what I was looking for.


Re: Multiple k,v pairs from a single map - possible?

Posted by Amandeep Khurana <am...@gmail.com>.
Here's the JIRA for the Oracle fix.
https://issues.apache.org/jira/browse/HADOOP-5616

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Fri, Mar 27, 2009 at 5:18 AM, Brian MacKay
<Br...@medecision.com> wrote:

> By the way, have you had a chance to post your Oracle fix to
> DBInputFormat?
> If so, what is the Jira tag #?
>
> Brian
>
> -----Original Message-----
> From: Amandeep Khurana [mailto:amansk@gmail.com]
> Sent: Friday, March 27, 2009 5:46 AM
> To: core-user@hadoop.apache.org
> Subject: Multiple k,v pairs from a single map - possible?
>
> Is it possible to output multiple key value pairs from a single map
> function run?
>
> For example, the mapper outputting <name,phone> and <name,address>
> simultaneously...
>
> Can I write multiple output.collect(...) commands?
>
> Amandeep
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
>
>
>

RE: Multiple k,v pairs from a single map - possible?

Posted by Brian MacKay <Br...@MEDecision.com>.
Amandeep,

Add this to your driver.....

MultipleOutputs.addNamedOutput(conf, "PHONE", TextOutputFormat.class,
                    Text.class, Text.class);

MultipleOutputs.addNamedOutput(conf, "NAME",
                    TextOutputFormat.class, Text.class, Text.class);



And in your reducer....

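    private MultipleOutputs mos;

    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {

        // namedOutPut = either PHONE or NAME (placeholder; choose it per record)
        // othervals = whatever value you want written (placeholder)

        while (values.hasNext()) {
            String value = values.next().toString();
            // getCollector(...) and collect(...) can both throw IOException
            mos.getCollector(namedOutPut, reporter).collect(
                    new Text(value), new Text(othervals));
        }
    }

    @Override
    public void configure(JobConf conf) {
        super.configure(conf);
        // Build the MultipleOutputs helper once per task
        mos = new MultipleOutputs(conf);
    }

    public void close() throws IOException {
        // Flush and close every named output
        mos.close();
    }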



By the way, have you had a chance to post your Oracle fix to
DBInputFormat?
If so, what is the JIRA issue number?

Brian

-----Original Message-----
From: Amandeep Khurana [mailto:amansk@gmail.com] 
Sent: Friday, March 27, 2009 5:46 AM
To: core-user@hadoop.apache.org
Subject: Multiple k,v pairs from a single map - possible?

Is it possible to output multiple key value pairs from a single map
function run?

For example, the mapper outputting <name,phone> and <name,address>
simultaneously...

Can I write multiple output.collect(...) commands?

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz








Re: Multiple k,v pairs from a single map - possible?

Posted by jason hadoop <ja...@gmail.com>.
You may write an arbitrary number of output.collect(...) calls in a single
map invocation.
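
To make that concrete, here is a minimal sketch in plain Java of a map function that emits two pairs for one input record. It is not Hadoop code: Collector is a hypothetical stand-in for Hadoop's OutputCollector so the example runs without the Hadoop jars, and the "name,phone,address" record layout and the "phone:" / "address:" value tags are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Collector is a hypothetical stand-in for Hadoop's
// OutputCollector so that this compiles without the Hadoop jars.
class MultiEmitSketch {

    interface Collector<K, V> {
        void collect(K key, V value);
    }

    // Assumed input layout (not from the thread): "name,phone,address".
    static void map(String line, Collector<String, String> output) {
        String[] fields = line.split(",", 3);
        if (fields.length < 3) {
            return; // skip malformed records
        }
        String name = fields[0].trim();
        // Two collect calls in one map invocation, one per pair.
        output.collect(name, "phone:" + fields[1].trim());
        output.collect(name, "address:" + fields[2].trim());
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        map("alice,555-1234,12 Elm St", (k, v) -> emitted.add(k + "\t" + v));
        for (String pair : emitted) {
            System.out.println(pair);
        }
    }
}
```

Both pairs share the same key, so the values are tagged here to let a reducer tell a phone from an address; that tagging is just one possible convention.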

You may even use MultipleOutputFormat to separate the output.collect
results and stream them to additional destinations.


Take care not to create a large number of files when using
MultipleOutputFormat.
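
One way to heed the caution about file counts is to derive the output name from a small fixed set rather than from raw keys. The sketch below is plain Java and only shows the naming logic; outputNameFor and the "phone" / "address" / "other" names are hypothetical, not part of Hadoop. In a real job, logic like this would presumably sit wherever the file name is chosen, for example in an override of MultipleOutputFormat's generateFileNameForKeyValue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Sketch only: outputNameFor is a hypothetical helper, not a Hadoop API.
class BoundedOutputNames {

    // Assumed fixed set of destinations; everything else shares one bucket.
    static final Set<String> KNOWN =
            new HashSet<>(Arrays.asList("phone", "address"));

    static String outputNameFor(String recordType) {
        String t = recordType == null
                ? ""
                : recordType.trim().toLowerCase(Locale.ROOT);
        // Unknown types collapse into "other", so the number of distinct
        // names never exceeds KNOWN.size() + 1.
        return KNOWN.contains(t) ? t : "other";
    }

    public static void main(String[] args) {
        for (String type : new String[] {"Phone", "address", "fax", "email"}) {
            System.out.println(type + " -> " + outputNameFor(type));
        }
    }
}
```

Because every unrecognized type falls into the same bucket, the number of output files stays bounded no matter how many distinct values show up in the data.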


On Fri, Mar 27, 2009 at 2:46 AM, Amandeep Khurana <am...@gmail.com> wrote:

> Is it possible to output multiple key value pairs from a single map
> function run?
>
> For example, the mapper outputting <name,phone> and <name,address>
> simultaneously...
>
> Can I write multiple output.collect(...) commands?
>
> Amandeep
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422