Posted to common-user@hadoop.apache.org by Daniel Yehdego <dt...@miners.utep.edu> on 2011/09/20 07:43:25 UTC

Reducer to concatenate string values

Good evening, 
I have string values output by a mapper and I want to concatenate them using a single reducer. However, the concatenated string does not come out in the right order. How can I write a reducer that receives the values from the mapper output and concatenates the strings in order? I look forward to your response, and thanks in advance.

Regards, 

Daniel T. Yehdego
Computational Science Program 
University of Texas at El Paso, UTEP 
dtyehdego@miners.utep.edu 		 	   		  

RE: Reducer to concatenate string values

Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Kai, 
Many thanks for your response. I will look at the links you sent me and get back to you.

Regards, 

Daniel T. Yehdego
Computational Science Program 
University of Texas at El Paso, UTEP 
dtyehdego@miners.utep.edu

> Subject: Re: Reducer to concatenate string values
> From: k@123.org
> Date: Tue, 20 Sep 2011 07:53:56 +0200
> To: common-user@hadoop.apache.org
> 
> Hi Daniel,
> 
> The values for a single key are passed to reduce() in no predictable order. In fact, if you run the same job on the same data again, the order will most likely be different every time.
> 
> If you want the values sorted, you need to apply a 'secondary sort'. The basic idea is to attach your values to the key and then let the sorting that Hadoop does on the key work for you.
> 
> However, you need to write some code to make that happen. Josh wrote a nice series of articles on it, and you will find more if you search for "secondary sort".
> 
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
> http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
> 
> Kai
> 

Re: Reducer to concatenate string values

Posted by Kai Voigt <k...@123.org>.
Hi Daniel,

The values for a single key are passed to reduce() in no predictable order. In fact, if you run the same job on the same data again, the order will most likely be different every time.

If you want the values sorted, you need to apply a 'secondary sort'. The basic idea is to attach your values to the key and then let the sorting that Hadoop does on the key work for you.

However, you need to write some code to make that happen. Josh wrote a nice series of articles on it, and you will find more if you search for "secondary sort".

http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
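To make the idea concrete, here is a minimal sketch in plain Python that only simulates what the shuffle does. It is not Hadoop code: the record layout (key, order, value) and the names secondary_sort and concat_reducer are assumptions made up for the illustration. The point is that the sort covers the composite (key, order) pair while the grouping looks at the key alone.

```python
from itertools import groupby


def secondary_sort(records):
    """Simulate a shuffle that sorts on a composite (key, order) pair.

    records: iterable of (key, order, value) tuples, as a mapper might
    emit them (hypothetical layout). Returns {key: [values by order]}.
    """
    # Sort on the natural key first and the ordering field second,
    # which is what sorting on a composite key gives you.
    shuffled = sorted(records, key=lambda r: (r[0], r[1]))
    # Group on the natural key only, so one reduce call sees all values
    # for that key, already in order.
    return {
        key: [value for _, _, value in group]
        for key, group in groupby(shuffled, key=lambda r: r[0])
    }


def concat_reducer(values):
    """Concatenate values in the order they arrive."""
    return "".join(values)


if __name__ == "__main__":
    mapper_output = [("doc", 2, "lo "), ("doc", 1, "Hel"), ("doc", 3, "world")]
    for key, values in secondary_sort(mapper_output).items():
        print(key, concat_reducer(values))  # prints: doc Hello world
```

In a real job the sorting and grouping are done by the framework, so the reducer itself only needs the concatenation step.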

Kai


-- 
Kai Voigt
k@123.org





Re: Reducer to concatenate string values

Posted by Ayon Sinha <ay...@yahoo.com>.
Hi Daniel,
There are ways to do what you are asking, but are you sure you are using the right framework for the problem? Hadoop does not guarantee any order for the values a reducer receives. If you want all values concatenated in sorted order, one way is to read all of the values into memory in the reducer, sort them, and then concatenate them. This is bounded by the reducer's memory, so it is obviously not scalable, and it defeats the purpose of Hadoop's parallelism.
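A minimal sketch of this first approach, written as a streaming-style reducer in Python. Several things here are assumptions for illustration: the input is tab-separated "key<TAB>value" lines already grouped by key, the values are meant to be ordered by plain string comparison, and the function names are made up.

```python
from itertools import groupby


def parse(lines):
    """Split 'key<TAB>value' lines into (key, value) pairs."""
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        yield key, value


def reduce_sorted_concat(lines, sep=""):
    """Yield (key, values sorted and joined) for each run of equal keys."""
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        # The whole group is held in memory here, which is exactly
        # the scalability limit mentioned above.
        values = sorted(value for _, value in group)
        yield key, sep.join(values)


if __name__ == "__main__":
    sample = ["k\tb\n", "k\ta\n", "m\tz\n"]
    for key, joined in reduce_sorted_concat(sample, sep=","):
        print(f"{key}\t{joined}")
```

In a streaming job you would pass sys.stdin instead of the sample list. If the values need a different order (numeric, by timestamp, and so on), sorted() would need a matching key function.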

The other way is outlined at http://sonerbalkir.blogspot.com/2010/01/simulating-secondary-sort-on-values.html: you make the value part of the mapper output key and then override the partitioner so that all key-value pairs with the same key prefix (not the entire key) go to the same reducer. You are essentially using Hadoop to sort the keys and then sending them to the same reducer.
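The partitioning step can be sketched as follows. This is plain Python for illustration, not the Hadoop Partitioner API: the "#" separator, the zero-padded sequence suffix, and the function name are all assumptions. The only point is that the reducer is chosen from the prefix, so every composite key sharing that prefix lands in the same place.

```python
import zlib


def partition_by_prefix(composite_key, num_reducers, sep="#"):
    """Choose a reducer index from the natural-key prefix only."""
    prefix = composite_key.split(sep, 1)[0]
    # crc32 is stable across processes, unlike Python's built-in
    # hash() for strings, so every mapper agrees on the target.
    return zlib.crc32(prefix.encode("utf-8")) % num_reducers


if __name__ == "__main__":
    # Zero-padded suffixes make string order match numeric order.
    keys = ["docA#0002", "docB#0001", "docA#0001", "docB#0002"]
    buckets = {}
    for k in keys:
        buckets.setdefault(partition_by_prefix(k, 4), []).append(k)
    for reducer, ks in sorted(buckets.items()):
        print(reducer, sorted(ks))
```

For streaming jobs, the Hadoop Streaming documentation describes a built-in KeyFieldBasedPartitioner that partitions on selected key fields; check the documentation for your Hadoop version for the exact options before relying on it.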
 
-Ayon



________________________________
From: Daniel Yehdego <dt...@miners.utep.edu>
To: common-user@hadoop.apache.org
Sent: Tuesday, September 20, 2011 8:38 AM
Subject: RE: Reducer to concatenate string values


Hi Ayon, 
any idea on my previous question?

Regards, 

Daniel T. Yehdego
Computational Science Program 
University of Texas at El Paso, UTEP 
dtyehdego@miners.utep.edu

> From: dtyehdego@miners.utep.edu
> To: common-user@hadoop.apache.org
> Subject: RE: Reducer to concatenate string values
> Date: Tue, 20 Sep 2011 06:06:22 +0000
> 
> 
> Hi Ayon, 
> I am using a C executable as my mapper (streaming), but I am not sure how to use a reducer that concatenates the values from a mapper in order. 
> 
> Regards, 
> 
> Daniel T. Yehdego
> Computational Science Program 
> University of Texas at El Paso, UTEP 
> dtyehdego@miners.utep.edu
> 
> > Date: Mon, 19 Sep 2011 22:54:46 -0700
> > From: ayonsinha@yahoo.com
> > Subject: Re: Reducer to concatenate string values
> > To: common-user@hadoop.apache.org
> > 
> > What are you using for your map/reduce? Streaming/Java/Pig/Hive?
> >  
> > -Ayon

Re: Reducer to concatenate string values

Posted by Shi Yu <sh...@uchicago.edu>.
Hi,

You probably need a secondary sort (based on a TextPair key) plus a
string concatenation helper (such as StringBuffer) to do this. I once
gave a talk about this at the Open Cloud Science workshop (see also my
previous post on this mailing list).
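As a rough sketch of the reducer side, assuming the secondary sort has already delivered the lines in order: the Python below expects tab-separated lines whose key looks like "name#sequence" (a made-up layout) and appends values to a list that is joined once per key, which plays the same role as StringBuffer.append() in Java.

```python
from itertools import groupby


def natural_key(line, sep="#"):
    """Return the part of the composite key before the separator."""
    composite = line.split("\t", 1)[0]
    return composite.split(sep, 1)[0]


def concat_in_arrival_order(sorted_lines, sep="#"):
    """Yield (natural key, concatenated values) without re-sorting."""
    for key, group in groupby(sorted_lines, key=lambda ln: natural_key(ln, sep)):
        parts = []  # list plus join stands in for a string buffer
        for line in group:
            _, _, value = line.rstrip("\n").partition("\t")
            parts.append(value)
        yield key, "".join(parts)


if __name__ == "__main__":
    sample = ["docA#0001\tHel\n", "docA#0002\tlo\n", "docB#0001\tBye\n"]
    for key, text in concat_in_arrival_order(sample):
        print(f"{key}\t{text}")
```

Unlike sorting inside the reducer, nothing is reordered here, so the result is only correct if the framework has already sorted on the full composite key.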

Best,

Shi




RE: Reducer to concatenate string values

Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Ayon, 
any idea on my previous question?

Regards, 

Daniel T. Yehdego
Computational Science Program 
University of Texas at El Paso, UTEP 
dtyehdego@miners.utep.edu


RE: Reducer to concatenate string values

Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Ayon, 
I am using a C executable as my mapper (streaming), but I am not sure how to use a reducer that concatenates the values from a mapper in order. 

Regards, 

Daniel T. Yehdego
Computational Science Program 
University of Texas at El Paso, UTEP 
dtyehdego@miners.utep.edu


Re: Reducer to concatenate string values

Posted by Ayon Sinha <ay...@yahoo.com>.
What are you using for your map/reduce? Streaming/Java/Pig/Hive?
 
-Ayon


