Posted to user@hadoop.apache.org by Jason Yang <li...@gmail.com> on 2012/09/17 09:16:57 UTC

How to output according to the key in reducer?

Hi, all

I was wondering how to write all the input with the same key to a single
file in the reducer?

Say I have some intermediate output from the mappers like this:
key   value
-----------
1     annie
2     Jason
1     andy
2     Joey
1     andrew
...

I would like to write all the intermediate output with key 1 to the file
"ID_1.dat", and all the intermediate output with key 2 to the file "ID_2.dat".

How could I achieve that?

-- 
YANG, Lin

Re: How to output according to the key in reducer?

Posted by Jason Yang <li...@gmail.com>.
All right, thanks~

2012/9/17 feng lu <am...@gmail.com>

> Hi
> Maybe you can refer to
>
> http://hadoop.apache.org/docs/r1.0.3/api/org/apache/hadoop/mapred/lib/MultipleSequenceFileOutputFormat.html
>
> or
>
>
> http://hadoop.apache.org/docs/r1.0.3/api/org/apache/hadoop/mapred/lib/MultipleTextOutputFormat.html
>
> example like this
>
>   public static class GeneratorOutputFormat extends
>       MultipleSequenceFileOutputFormat<IntWritable,Text> {
>     // generate a filename based on the segnum stored for this text
>     protected String generateFileNameForKeyValue(IntWritable key, Text value,
>         String name) {
>       return "ID_" + key.get() + ".dat";
>     }
>
>   }
>
> On Mon, Sep 17, 2012 at 3:16 PM, Jason Yang <li...@gmail.com> wrote:
>
>> Hi, all
>>
>> I was wondering how to write all the input with the same key to a single
>> file in the reducer?
>>
>> Say I have some intermediate output from the mappers like this:
>> key   value
>> -----------
>> 1     annie
>> 2     Jason
>> 1     andy
>> 2     Joey
>> 1     andrew
>> ...
>>
>> I would like to write all the intermediate output with key 1 to the file
>> "ID_1.dat", and all the intermediate output with key 2 to the file "ID_2.dat".
>>
>> How could I achieve that?
>>
>> --
>> YANG, Lin
>>
>>
>
>
> --
> Don't Grow Old, Grow Up... :-)
>



-- 
YANG, Lin

Re: How to output according to the key in reducer?

Posted by feng lu <am...@gmail.com>.
Hi,

You could use MultipleSequenceFileOutputFormat:
http://hadoop.apache.org/docs/r1.0.3/api/org/apache/hadoop/mapred/lib/MultipleSequenceFileOutputFormat.html

or MultipleTextOutputFormat:

http://hadoop.apache.org/docs/r1.0.3/api/org/apache/hadoop/mapred/lib/MultipleTextOutputFormat.html

For example:

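  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.lib.MultipleSequenceFileOutputFormat;

  public static class GeneratorOutputFormat extends
      MultipleSequenceFileOutputFormat<IntWritable, Text> {
    // Derive the output file name from the record's key,
    // e.g. key 1 -> "ID_1.dat", key 2 -> "ID_2.dat".
    @Override
    protected String generateFileNameForKeyValue(IntWritable key, Text value,
        String name) {
      return "ID_" + key.get() + ".dat";
    }
  }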
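
To make the routing concrete without a cluster, here is a minimal,
Hadoop-free sketch of what an override like the one above is meant to
achieve: each record goes to the file named after its key. The class
KeyedFileRouter and its route method are hypothetical names invented for
this illustration and are not part of any Hadoop API; the in-memory map
only stands in for the real output files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates per-key output files in memory.
class KeyedFileRouter {

    // Same naming rule as the override above: key 1 -> "ID_1.dat".
    static String fileNameFor(int key) {
        return "ID_" + key + ".dat";
    }

    // Groups values by the file name derived from their key.
    // The returned map plays the role of the output directory:
    // file name -> lines written to that file, in arrival order.
    static Map<String, List<String>> route(int[] keys, String[] values) {
        Map<String, List<String>> files = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            files.computeIfAbsent(fileNameFor(keys[i]), name -> new ArrayList<>())
                 .add(values[i]);
        }
        return files;
    }

    public static void main(String[] args) {
        // Sample records taken from the question.
        int[] keys = {1, 2, 1, 2, 1};
        String[] values = {"annie", "Jason", "andy", "Joey", "andrew"};
        route(keys, values)
            .forEach((file, lines) -> System.out.println(file + " <- " + lines));
    }
}
```

With the sample records from the question, this prints
"ID_1.dat <- [annie, andy, andrew]" and "ID_2.dat <- [Jason, Joey]". In
an actual job you would not write this grouping yourself; the output
format consults generateFileNameForKeyValue for each record it writes.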

On Mon, Sep 17, 2012 at 3:16 PM, Jason Yang <li...@gmail.com> wrote:

> Hi, all
>
> I was wondering how to write all the input with the same key to a single
> file in the reducer?
>
> Say I have some intermediate output from the mappers like this:
> key   value
> -----------
> 1     annie
> 2     Jason
> 1     andy
> 2     Joey
> 1     andrew
> ...
>
> I would like to write all the intermediate output with key 1 to the file
> "ID_1.dat", and all the intermediate output with key 2 to the file "ID_2.dat".
>
> How could I achieve that?
>
> --
> YANG, Lin
>
>


-- 
Don't Grow Old, Grow Up... :-)

Re: How to output according to the key in reducer?

Posted by Hemanth Yamijala <yh...@thoughtworks.com>.
Hi,

Could you check whether the MultipleOutputs class works for you?

http://hadoop.apache.org/docs/r1.0.3/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html

On Mon, Sep 17, 2012 at 12:46 PM, Jason Yang <li...@gmail.com> wrote:

> Hi, all
>
> I was wondering how to write all the input with the same key to a single
> file in the reducer?
>
> Say I have some intermediate output from the mappers like this:
> key   value
> -----------
> 1     annie
> 2     Jason
> 1     andy
> 2     Joey
> 1     andrew
> ...
>
> I would like to write all the intermediate output with key 1 to the file
> "ID_1.dat", and all the intermediate output with key 2 to the file "ID_2.dat".
>
> How could I achieve that?
>
> --
> YANG, Lin
>
>
