Posted to mapreduce-user@hadoop.apache.org by Bhaskar Ghosh <bj...@yahoo.co.in> on 2010/09/19 12:22:07 UTC

How to create a composite value object for output from Map method

Hi All,

What would be the right approach to solve this problem:
	1. I need to output an object as the value from my map method. The object's class should have two members: an ArrayList<String> and an integer.

I tried the following two ways, but they are not working:
	* I wrote a class MyCompositeValueWritable that implements the Writable interface. Inside the overridden readFields and write methods, I try to read/write using the ObjectWritable class. [see attached file MyWordCount_ObjVal1_2.java]

	* The custom class is a plain class 'MyCompositeValue' that does not implement or inherit anything. The Map and Reduce methods try to output <key, value> pairs, with the MyCompositeValue object wrapped in the ObjectWritable class as the value. [see attached file Case2.txt]

	* Am I going wrong somewhere? Appreciate any help.

	2. I have another problem, in which I need two types of mappers and reducers, and I want to execute them in this order:
	* Mapper1 -> Reducer1 -> Mapper2 -> Reducer2
	* Is it possible through the ChainMapper and/or ChainReducer classes? If yes, then how? Can anybody provide a starting working example, or point me to a good URL for the same?
	* Currently, I am using a work-around: the first Mapper-Reducer pair writes to HDFS, and the second Mapper-Reducer pair picks up that output from HDFS and writes further processed output to another HDFS directory. (A sketch of this two-job approach follows this list.)
	* An example would be really helpful.
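
For concreteness, here is a minimal sketch of that two-job driver. All class names (FirstMapper, FirstReducer, SecondMapper, SecondReducer) and the key/value types are hypothetical placeholders, not code from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoStageDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path input = new Path(args[0]);
    Path intermediate = new Path(args[1]);   // scratch directory between the two jobs
    Path output = new Path(args[2]);

    // Stage 1: Mapper1 -> Reducer1, writing to the intermediate HDFS directory.
    Job job1 = new Job(conf, "stage 1");
    job1.setJarByClass(TwoStageDriver.class);
    job1.setMapperClass(FirstMapper.class);
    job1.setReducerClass(FirstReducer.class);
    job1.setOutputKeyClass(Text.class);
    job1.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job1, input);
    FileOutputFormat.setOutputPath(job1, intermediate);
    if (!job1.waitForCompletion(true)) {
      System.exit(1);   // stop if stage 1 fails
    }

    // Stage 2: Mapper2 -> Reducer2, reading stage 1's output from HDFS.
    Job job2 = new Job(conf, "stage 2");
    job2.setJarByClass(TwoStageDriver.class);
    job2.setMapperClass(SecondMapper.class);
    job2.setReducerClass(SecondReducer.class);
    job2.setOutputKeyClass(Text.class);
    job2.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job2, intermediate);
    FileOutputFormat.setOutputPath(job2, output);
    System.exit(job2.waitForCompletion(true) ? 0 : 1);
  }
}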
Thanks
Bhaskar Ghosh

"Ignorance is Bliss... Knowledge never brings Peace!!!"


Re: How to create a composite value object for output from Map method

Posted by Bhaskar Ghosh <bj...@yahoo.co.in>.
Chris / All,

Any idea why this is erroring out?





Re: How to create a composite value object for output from Map method

Posted by Bhaskar Ghosh <bj...@yahoo.co.in>.
Hello All,

Thanks Chris for your suggestion and time.

I tried as you said. Now it is giving me a runtime NullPointerException:

java.lang.NullPointerException
	at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)
	at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
	at MyWordCount_ObjVal1_2$MyCompositeValueWritable.readFields(MyWordCount_ObjVal1_2.java:99)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)

I tried running just the map method, which writes the MyCompositeValueWritable object as the value, and commented out the combiner and reducer classes. The error still comes from inside MyCompositeValueWritable.readFields().

job.setMapperClass(TokenizerMapper.class);
// commented out:
//    job.setCombinerClass(IntSumReducer.class);
//    job.setReducerClass(IntSumReducer.class);
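
One thing worth double-checking in the driver (an assumption on my part, since the full job configuration is not shown here): when the map output value class differs from the final output value class, it has to be declared explicitly, e.g.

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(MyCompositeValueWritable.class);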

My map method looks like this:

public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        MyCompositeValueWritable compositeValue = new MyCompositeValueWritable();
        compositeValue.addToList(localname);
        compositeValue.setValue(1);
        context.write(word, compositeValue);
    }
}
My readFields and write methods' source is below:


public void readFields(DataInput in) throws IOException {
    ObjectWritable list = new ObjectWritable(String[].class, listOfString.toArray(new String[]{""}));
    list.readFields(in);
    value = in.readInt();
}

public void write(DataOutput out) throws IOException {
    ObjectWritable list = new ObjectWritable(String[].class, listOfString.toArray(new String[]{""}));
    list.write(out);
    out.write(value);
}
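
For comparison, here is a minimal sketch of a Writable that serializes the two fields directly, with no ObjectWritable involved. The addToList/setValue names are kept from the code above; everything else is an assumption, not the attached file:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Writable;

public class MyCompositeValueWritable implements Writable {
  private List<String> listOfString = new ArrayList<String>();
  private int value;

  public void addToList(String s) { listOfString.add(s); }
  public void setValue(int v) { value = v; }

  public void write(DataOutput out) throws IOException {
    out.writeInt(listOfString.size());      // list length first
    for (String s : listOfString) {
      out.writeUTF(s);                      // then each string
    }
    out.writeInt(value);                    // then the int
  }

  public void readFields(DataInput in) throws IOException {
    listOfString.clear();                   // Writable instances are reused across records
    int n = in.readInt();
    for (int i = 0; i < n; i++) {
      listOfString.add(in.readUTF());
    }
    value = in.readInt();
  }
}

The key contract is that write and readFields walk exactly the same byte sequence. Note that in the posted version above, out.write(value) emits a single byte while in.readInt() consumes four.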
 
Has anybody faced a similar issue? Appreciate any help.
Thanks
Bhaskar Ghosh
Hyderabad, India

http://www.google.com/profiles/bjgindia

"Ignorance is Bliss... Knowledge never brings Peace!!!"






RE: How to create a composite value object for output from Map method

Posted by Ch...@sungard.com.
I think your first approach to serializing is correct, except for the use of ObjectWritable.  From the docs, ObjectWritable only handles Strings, arrays, and primitives.  You are trying to use it to serialize your ArrayList.  Try converting the ArrayList to an array of Strings first.

 

As for the second problem, I’d have a look at Cascading (http://www.cascading.org/).

 

Hope these help…

 

Chris
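
Following that suggestion, here is a minimal sketch of write/readFields built around a String[] inside an ObjectWritable. The class name and the setConf call are assumptions, not code from this thread; ObjectWritable is Configurable, and deserializing without a Configuration set is one plausible source of the NullPointerException reported above:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.ObjectWritable;
import org.apache.hadoop.io.Writable;

public class CompositeValueViaObjectWritable implements Writable {
  private List<String> listOfString = new ArrayList<String>();
  private int value;
  private Configuration conf = new Configuration();

  public void write(DataOutput out) throws IOException {
    // Convert the ArrayList to a String[] before handing it to ObjectWritable.
    String[] arr = listOfString.toArray(new String[0]);
    new ObjectWritable(String[].class, arr).write(out);
    out.writeInt(value);
  }

  public void readFields(DataInput in) throws IOException {
    // Read into a fresh ObjectWritable rather than one pre-filled with the old list.
    ObjectWritable ow = new ObjectWritable();
    ow.setConf(conf);                       // needed so ObjectWritable can resolve classes
    ow.readFields(in);
    listOfString = new ArrayList<String>(Arrays.asList((String[]) ow.get()));
    value = in.readInt();
  }
}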

 
