Posted to mapreduce-user@hadoop.apache.org by Bhaskar Ghosh <bj...@yahoo.co.in> on 2010/09/19 12:22:07 UTC
How to create a composite value object for output from Map method
Hi All,
What would be the right approach to solve these problems?
1. I need to output an object as the value from my map method. The object's
class should have two members: an ArrayList<String> and an integer.
I tried the following two approaches, but neither works:
* I wrote a class MyCompositeValueWritable that implements the Writable
interface. Inside the overridden readFields and write methods, I try to
read/write using the ObjectWritable class. [see attached file
MyWordCount_ObjVal1_2.java]
* The custom class is a plain class 'MyCompositeValue' that does not implement
or inherit anything. The map and reduce methods try to output <key,
value=<object of MyCompositeValue>> using the ObjectWritable class. [see
attached file Case2.txt]
Am I going wrong somewhere? I would appreciate any help.
2. I have another problem, in which I need two types of mappers and reducers,
and I want to execute them in this order:
* Mapper1 -> Reducer1 -> Mapper2 -> Reducer2
* Is this possible through the ChainMapper and/or ChainReducer classes? If yes,
how? Can anybody provide a working starter example, or point me to a good URL
for the same?
* Currently, I am doing it with a workaround:
* The first Mapper-Reducer pair writes to HDFS. Then the second Mapper-Reducer
pair picks up that output from HDFS and writes the further processed output to
another HDFS directory.
* An example would be really helpful.
Thanks
Bhaskar Ghosh
"Ignorance is Bliss... Knowledge never brings Peace!!!"
Re: How to create a composite value object for output from Map method
Posted by Bhaskar Ghosh <bj...@yahoo.co.in>.
Chris / All,
Any idea why this is error-ing out?
________________________________
From: Bhaskar Ghosh <bj...@yahoo.co.in>
To: mapreduce-user@hadoop.apache.org
Sent: Tue, 21 September, 2010 12:08:21 AM
Subject: Re: How to create a composite value object for output from Map method
Hello All,
Thanks Chris for your suggestion and time.
I tried as you said. Now it gives me a NullPointerException at runtime:
java.lang.NullPointerException
    at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)
    at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
    at MyWordCount_ObjVal1_2$MyCompositeValueWritable.readFields(MyWordCount_ObjVal1_2.java:99)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
    at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
    at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
I ran only the map method, which writes a MyCompositeValueWritable object as
the value, and commented out the reducer class. The error still occurs inside
MyCompositeValueWritable.readFields().
job.setMapperClass(TokenizerMapper.class);
// commented out
// job.setCombinerClass(IntSumReducer.class);
// job.setReducerClass(IntSumReducer.class);
My map method looks like this:
public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        MyCompositeValueWritable compositeValue = new MyCompositeValueWritable();
        compositeValue.addToList(localname);
        compositeValue.setValue(1);
        context.write(word, compositeValue);
    }
}
The source of my readFields and write methods is below:
public void readFields(DataInput in) throws IOException {
    ObjectWritable list = new ObjectWritable(String[].class,
            listOfString.toArray(new String[]{""}));
    list.readFields(in);
    value = in.readInt();
}

public void write(DataOutput out) throws IOException {
    ObjectWritable list = new ObjectWritable(String[].class,
            listOfString.toArray(new String[]{""}));
    list.write(out);
    out.write(value);
}
Has anybody faced a similar issue? I would appreciate any help.
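[Editorial note] Separately from the NullPointerException, the write method above calls out.write(value) while readFields calls in.readInt(). In java.io, DataOutput.write(int) emits only the low-order byte, whereas readInt() consumes four bytes, so the two methods do not agree on the wire format. The sketch below shows the mismatch using only java.io streams. The class and method names are hypothetical and not taken from the attached file, and the in-memory buffer stands in for the stream Hadoop would supply.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; it is not part of the original attachment.
class WriteVersusWriteInt {

    // Serializes the value with DataOutput.write(int), which keeps only the low-order byte.
    static byte[] encodeWithWrite(int value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.write(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serializes the value with DataOutput.writeInt(int), which emits four bytes.
    static byte[] encodeWithWriteInt(int value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true when readInt() runs out of input on the given bytes.
    static boolean readIntHitsEndOfStream(byte[] bytes) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        try {
            in.readInt();
            return false;
        } catch (EOFException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write(1) bytes:    " + encodeWithWrite(1).length);
        System.out.println("writeInt(1) bytes: " + encodeWithWriteInt(1).length);
        System.out.println("readInt fails after write(1):    "
                + readIntHitsEndOfStream(encodeWithWrite(1)));
        System.out.println("readInt fails after writeInt(1): "
                + readIntHitsEndOfStream(encodeWithWriteInt(1)));
    }
}
```

In a real job the stream usually continues with the next record, so instead of an EOFException the reader would more likely consume bytes belonging to the following key or value. Pairing writeInt with readInt keeps the two sides symmetric.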
Thanks
Bhaskar Ghosh
Hyderabad, India
http://www.google.com/profiles/bjgindia
"Ignorance is Bliss... Knowledge never brings Peace!!!"
________________________________
From: "Christopher.Shain@sungard.com" <Ch...@sungard.com>
To: mapreduce-user@hadoop.apache.org
Sent: Sun, 19 September, 2010 9:12:20 PM
Subject: RE: How to create a composite value object for output from Map method
I think your first approach at serializing is correct, except for the use of
ObjectWritable. From the docs, ObjectWritable only handles Strings, Arrays, and
primitives. You are trying to use it to serialize your ArrayList. Try
converting the ArrayList to an array of Strings first.
As for the second problem, I’d have a look at Cascading.
Hope these help…
Chris
RE: How to create a composite value object for output from Map method
Posted by Ch...@sungard.com.
I think your first approach at serializing is correct, except for the use of ObjectWritable. From the docs, ObjectWritable only handles Strings, Arrays, and primitives. You are trying to use it to serialize your ArrayList. Try converting the ArrayList to an array of Strings first.
As for the second problem, I’d have a look at Cascading <http://www.cascading.org/>.
Hope these help…
Chris
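[Editorial note] One way to act on the advice above without ObjectWritable at all is to serialize the fields by hand: write the list length, then each string, then the integer, and read them back in the same order. The sketch below is a minimal illustration under stated assumptions, not the code from the thread. The class name CompositeValueSketch and the roundTrip helper are hypothetical, and the class uses only java.io so that it runs without Hadoop on the classpath. In a job it would additionally be declared as implementing org.apache.hadoop.io.Writable, whose write and readFields methods have these same signatures. A possible reading of the stack trace earlier in the thread is that an ObjectWritable created inside readFields has no Configuration with which to resolve class names; that is an assumption the thread does not confirm, and hand-written serialization sidesteps the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical composite value: a list of strings plus an int.
// With Hadoop available, add "implements org.apache.hadoop.io.Writable".
class CompositeValueSketch {
    private final List<String> listOfString = new ArrayList<>();
    private int value;

    // A no-argument constructor lets a framework create instances reflectively.
    CompositeValueSketch() {}

    void addToList(String s) { listOfString.add(s); }
    void setValue(int v) { value = v; }
    List<String> getList() { return listOfString; }
    int getValue() { return value; }

    // Wire format: element count, each string, then the integer.
    public void write(DataOutput out) throws IOException {
        out.writeInt(listOfString.size());
        for (String s : listOfString) {
            out.writeUTF(s);
        }
        out.writeInt(value);
    }

    // Reads fields in exactly the order write() emitted them.
    public void readFields(DataInput in) throws IOException {
        listOfString.clear(); // instances may be reused, so drop stale state first
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            listOfString.add(in.readUTF());
        }
        value = in.readInt();
    }

    // Serializes to memory and deserializes into a fresh instance.
    static CompositeValueSketch roundTrip(CompositeValueSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            CompositeValueSketch copy = new CompositeValueSketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CompositeValueSketch v = new CompositeValueSketch();
        v.addToList("alpha");
        v.addToList("beta");
        v.setValue(1);
        CompositeValueSketch copy = roundTrip(v);
        System.out.println(copy.getList() + " " + copy.getValue());
    }
}
```

writeUTF is used here because it is part of java.io; it limits each encoded string to 65535 bytes. Within Hadoop, Text.writeString and Text.readString are an alternative for longer strings.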