Posted to common-user@hadoop.apache.org by M B <ma...@gmail.com> on 2010/04/16 00:04:02 UTC

HashMap type output from mapper

How can I have my mapper output a HashMap as the value in the
OutputCollector (so my reducer can work directly on the HashMap key/value
pairs)?  I tried just setting things up as HashMap in
conf.setOutputValueClass(HashMap.class), but that didn't work.  What do I
need to change to allow another type (hashmap or arraylist) to be the output
of the map method?

      public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, HashMap> {
...
        public void map(LongWritable key, Text value, OutputCollector<Text, HashMap> output, Reporter reporter) throws IOException {
...
          output.collect(new Text("hello"), myHash);
...

Re: HashMap type output from mapper

Posted by M B <ma...@gmail.com>.
Tom,
MapWritable looks promising.  Would there be a straightforward way to convert
from a HashMap to this object?  I'm not sure what method I'd need to use for
that.

Also, how exactly do I reference the value of a single key in a MapWritable
object?  I must be doing something wrong as I tried referencing it just like
I would with a HashMap, but I get a null pointer exception.

Here is what I had:
output.collect(new Text(myMW.get("mkey")), ...)
where myMW is a MapWritable and I assigned a key/value pair of
"mkey"="dummyval" using the put method.

thanks
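
A likely cause of the null pointer exception, sketched so that it runs without Hadoop on the classpath: MapWritable is a Map<Writable, Writable>, so an entry stored under a Text key is not found when the lookup uses the plain String "mkey", and get returns null. The TextKey class below is a hypothetical stand-in for org.apache.hadoop.io.Text, and wrap is an illustrative helper, not a Hadoop method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyTypeMismatch {
    // Hypothetical stand-in for org.apache.hadoop.io.Text: it wraps a String
    // and is equal only to other TextKey instances, never to a plain String.
    static final class TextKey {
        private final String value;

        TextKey(String value) {
            this.value = Objects.requireNonNull(value);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof TextKey && ((TextKey) other).value.equals(value);
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Illustrative conversion: copy each String entry into the wrapper-keyed map.
    static Map<TextKey, TextKey> wrap(Map<String, String> source) {
        Map<TextKey, TextKey> target = new HashMap<>();
        for (Map.Entry<String, String> entry : source.entrySet()) {
            target.put(new TextKey(entry.getKey()), new TextKey(entry.getValue()));
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("mkey", "dummyval");
        Map<TextKey, TextKey> wrapped = wrap(plain);

        // A String never equals a TextKey, so this lookup misses.
        System.out.println(wrapped.get("mkey"));              // prints "null"
        // Looking up with the wrapper type finds the entry.
        System.out.println(wrapped.get(new TextKey("mkey"))); // prints "dummyval"
    }
}
```

With the real classes, the same idea would presumably be a loop that calls put(new Text(k), new Text(v)) on the MapWritable, a lookup of the form myMW.get(new Text("mkey")), and a cast of the returned Writable to Text before use.
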
On Thu, Apr 15, 2010 at 3:23 PM, Tom White <to...@cloudera.com> wrote:

> Have a look at org.apache.hadoop.io.MapWritable, which is a Map for
> storing Writable keys and values.
>
> Cheers,
> Tom

Re: HashMap type output from mapper

Posted by Tom White <to...@cloudera.com>.
Have a look at org.apache.hadoop.io.MapWritable, which is a Map for
storing Writable keys and values.

Cheers,
Tom

On Thu, Apr 15, 2010 at 3:17 PM, Eric Sammer <es...@cloudera.com> wrote:
> You need to implement a custom Writable (the serialization interface
> supported by Hadoop). If you want to use your own custom types as
> keys, they must implement WritableComparable. You could implement a
> "box" custom Writable to hold HashMap or any other type, but you'd
> have to find a way of encoding the type you want to send over the wire
> (probably as a byte array). You could use Java serialization to turn
> the HashMap into a byte buffer and then box that up in a BytesWritable
> (which holds byte[]). It will probably be slow, though.

Re: HashMap type output from mapper

Posted by Eric Sammer <es...@cloudera.com>.
You need to implement a custom Writable (the serialization interface
supported by Hadoop). If you want to use your own custom types as
keys, they must implement WritableComparable. You could implement a
"box" custom Writable to hold HashMap or any other type, but you'd
have to find a way of encoding the type you want to send over the wire
(probably as a byte array). You could use Java serialization to turn
the HashMap into a byte buffer and then box that up in a BytesWritable
(which holds byte[]). It will probably be slow, though.
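
A minimal sketch of the "box" idea, assuming the map holds String keys and values. StringMapBox and roundTrip are hypothetical names. The write and readFields methods follow the signatures of Hadoop's Writable interface, but the class deliberately does not declare "implements Writable" so that it compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical box around a HashMap<String, String>. In a real job this class
// would also declare "implements org.apache.hadoop.io.Writable".
class StringMapBox {
    private final HashMap<String, String> map = new HashMap<>();

    HashMap<String, String> getMap() {
        return map;
    }

    // Encoding: the entry count, followed by each key and value as UTF strings.
    public void write(DataOutput out) throws IOException {
        out.writeInt(map.size());
        for (Map.Entry<String, String> entry : map.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
        }
    }

    // Decoding: clear any previous contents first, since Hadoop may reuse
    // the same value object for successive records.
    public void readFields(DataInput in) throws IOException {
        map.clear();
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            map.put(key, value);
        }
    }

    // Illustrative helper: serialize to a byte array and read it back.
    static StringMapBox roundTrip(StringMapBox original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        StringMapBox copy = new StringMapBox();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        StringMapBox box = new StringMapBox();
        box.getMap().put("mkey", "dummyval");
        System.out.println(roundTrip(box).getMap()); // prints "{mkey=dummyval}"
    }
}
```

Writing the entries field by field avoids the overhead of Java serialization mentioned above. One limitation of this particular encoding is that writeUTF rejects null strings and strings whose encoded form exceeds 65535 bytes.
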

On Thu, Apr 15, 2010 at 3:04 PM, M B <ma...@gmail.com> wrote:
> How can I have my mapper output a HashMap as the value in the
> OutputCollector (so my reducer can work directly on the HashMap key/value
> pairs)?  I tried just setting things up as HashMap in
> conf.setOutputValueClass(HashMap.class), but that didn't work.  What do I
> need to change to allow another type (hashmap or arraylist) to be the output
> of the map method?
>
>      public static class Map extends MapReduceBase implements
> Mapper<LongWritable, Text, Text, HashMap> {
> ...
>        public void map(LongWritable key, Text value, OutputCollector<Text,
> HashMap> output, Reporter reporter) throws IOException {
> ...
>          output.collect(new Text("hello"), myHash);
> ...
>



-- 
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com