Posted to common-user@hadoop.apache.org by Daniel Yehdego <dt...@miners.utep.edu> on 2011/09/20 07:43:25 UTC
Reducer to concatenate string values
Good evening,
My mapper emits string values, and I want to concatenate them using a single reducer. However, the concatenated values do not come out in order. How can I write a reducer that receives the values from the mapper output and concatenates the strings in order? I look forward to your response, and thanks in advance.
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehdego@miners.utep.edu
RE: Reducer to concatenate string values
Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Kai,
Many thanks for your response. I will look at the links you sent me and get back to you.
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehdego@miners.utep.edu
> Subject: Re: Reducer to concatenate string values
> From: k@123.org
> Date: Tue, 20 Sep 2011 07:53:56 +0200
> To: common-user@hadoop.apache.org
Re: Reducer to concatenate string values
Posted by Kai Voigt <k...@123.org>.
Hi Daniel,
The values for a single key are passed to reduce() in no guaranteed order. In fact, if you run the same job on the same data again, the order may well be different each time.
If you want the values sorted, you need to apply a 'secondary sort'. The basic idea is to attach your values to the key and then benefit from the sorting Hadoop already does on the key.
However, you need to write some code to make that happen. Josh wrote a nice series of articles on it, and you will find more if you search for "secondary sort".
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
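As a rough illustration of the idea, here is a minimal sketch in plain Python that simulates it outside Hadoop. The record layout (natural key, sequence number, value) and the function name are assumptions made for this example; in a real job the sort would be done by the framework during the shuffle, not by sorted().

```python
from operator import itemgetter

# Hypothetical mapper output: (natural_key, sequence_number, value).
# The sequence number stands for whatever field defines the desired order.
records = [
    ("doc1", 2, "world"),
    ("doc2", 1, "foo"),
    ("doc1", 1, "hello"),
    ("doc2", 2, "bar"),
]


def secondary_sort_concat(records, sep=" "):
    """Sort on the composite key, then concatenate per natural key."""
    # Stand-in for the framework sorting on the composite (key, seq) key.
    ordered = sorted(records, key=itemgetter(0, 1))
    pieces = {}
    for natural_key, _seq, value in ordered:
        # Grouping looks at the natural key only, so every value for a key
        # arrives here already in sequence order.
        pieces.setdefault(natural_key, []).append(value)
    return {key: sep.join(values) for key, values in pieces.items()}
```

Calling secondary_sort_concat(records) on the sample data joins the values of each key in sequence order, regardless of the order in which the records were emitted.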
Kai
--
Kai Voigt
k@123.org
Re: Reducer to concatenate string values
Posted by Ayon Sinha <ay...@yahoo.com>.
Hi Daniel,
There are ways to do what you are asking for, but are you sure this is the right framework for the problem? Hadoop's premise is that the values passed to a Reducer are not sorted. If you want all values concatenated in sorted order, one way is to read all the values into memory in the reducer (subject to memory limits), sort them, and concatenate them. This solution is obviously not scalable and defeats the purpose of Hadoop's parallelism.
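A minimal sketch of this in-memory approach, written as the core of a streaming-style reducer in Python. It assumes each input line has the form key<TAB>value and that plain string order is the order wanted; the function name and the line format are assumptions for illustration, not something stated in this thread.

```python
from collections import defaultdict


def reduce_in_memory(lines, sep=""):
    """Buffer every value per key, then sort and concatenate them."""
    values_by_key = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _tab, value = line.partition("\t")
        values_by_key[key].append(value)  # everything is held in memory
    for key in sorted(values_by_key):
        # Values are sorted as plain strings; this is the step that does
        # not scale when a key has very many values.
        yield key + "\t" + sep.join(sorted(values_by_key[key]))
```

In a streaming job the function could be fed sys.stdin and its results printed line by line. The buffer grows with the number of values, which is the memory limitation mentioned above.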
The other way is outlined at http://sonerbalkir.blogspot.com/2010/01/simulating-secondary-sort-on-values.html where you put the value into the mapper output key and then override the partitioner so that all key-value pairs with the same key prefix (not the entire key) go to the same reducer. In effect, you use Hadoop to sort the keys and then send them to the same reducer.
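To make the partitioning step concrete, here is a small Python sketch of a partition function that looks only at the key prefix. The "#" separator, the zero-padded suffix, and the use of crc32 as the hash are all assumptions chosen for this example; a real Hadoop partitioner would be written against the Hadoop API and would use its own hashing.

```python
import zlib


def partition(composite_key, num_reducers, sep="#"):
    """Choose a reducer from the natural-key prefix, ignoring the suffix."""
    natural_key = composite_key.split(sep, 1)[0]
    # crc32 is used only because it gives the same result on every run.
    return zlib.crc32(natural_key.encode("utf-8")) % num_reducers


# Hypothetical composite keys: natural key, separator, zero-padded order field.
keys = ["doc1#0002", "doc1#0001", "doc2#0001"]
```

Because only the prefix is hashed, "doc1#0001" and "doc1#0002" always land on the same reducer, while sorting on the full key still puts them in order. The zero padding matters when keys are compared as text, since "10" sorts before "9" otherwise.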
-Ayon
________________________________
From: Daniel Yehdego <dt...@miners.utep.edu>
To: common-user@hadoop.apache.org
Sent: Tuesday, September 20, 2011 8:38 AM
Subject: RE: Reducer to concatenate string values
Hi Ayon,
any idea on my previous question?
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehdego@miners.utep.edu
> From: dtyehdego@miners.utep.edu
> To: common-user@hadoop.apache.org
> Subject: RE: Reducer to concatenate string values
> Date: Tue, 20 Sep 2011 06:06:22 +0000
>
>
> Hi Ayon,
> I am using a C executable as my mapper (streaming), but I am not sure how to use a reducer that concatenates the values from a mapper in order.
>
> Regards,
>
> Daniel T. Yehdego
> Computational Science Program
> University of Texas at El Paso, UTEP
> dtyehdego@miners.utep.edu
>
> > Date: Mon, 19 Sep 2011 22:54:46 -0700
> > From: ayonsinha@yahoo.com
> > Subject: Re: Reducer to concatenate string values
> > To: common-user@hadoop.apache.org
> >
> > What are you using for your map/reduce? Streaming/Java/Pig/Hive?
> >
> > -Ayon
> >
> >
> >
> > ________________________________
> > From: Daniel Yehdego <dt...@miners.utep.edu>
> > To: common-user@hadoop.apache.org
> > Sent: Monday, September 19, 2011 10:43 PM
> > Subject: Reducer to concatenate string values
> >
> >
> > Good evening,
> > I have a certain value output from a mapper and I want to concatenate the string outputs using a Reducer (one reducer).But the order of the concatenated string values is not in order. How can I use a reducer that receives a value from a mapper output and concatenate the strings in order. waiting your response and thanks in advance.
> >
> > Regards,
> >
> > Daniel T. Yehdego
> > Computational Science Program
> > University of Texas at El Paso, UTEP
> > dtyehdego@miners.utep.edu
>
Re: Reducer to concatenate string values
Posted by Shi Yu <sh...@uchicago.edu>.
Hi,
You probably need to use a secondary sort (based on a TextPair key) and
a string concatenation helper (like StringBuffer) to do this. I once
gave a talk at the Open Cloud Science workshop about this (also see my
previous post in this newsgroup).
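The reducer side of that combination might look like the following Python sketch. It assumes the input lines are key<TAB>seq<TAB>value and already arrive sorted by key and then by seq, which is what the secondary sort is meant to provide; the field layout and the function name are hypothetical. A list joined once at the end plays the role StringBuffer would play in Java.

```python
from itertools import groupby


def concat_sorted(lines, sep=""):
    """Concatenate values per key from lines already sorted by key, then seq."""
    parsed = (line.rstrip("\n").split("\t", 2) for line in lines if line.strip())
    for key, group in groupby(parsed, key=lambda fields: fields[0]):
        parts = []  # append pieces here and join once, like a StringBuffer
        for _key, _seq, value in group:
            parts.append(value)
        yield key + "\t" + sep.join(parts)
```

Unlike the in-memory sort, this version never reorders anything: it relies entirely on the input order and holds only the pieces of the key currently being processed.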
Best,
Shi
RE: Reducer to concatenate string values
Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Ayon,
Any ideas on my previous question?
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehdego@miners.utep.edu
RE: Reducer to concatenate string values
Posted by Daniel Yehdego <dt...@miners.utep.edu>.
Hi Ayon,
I am using a C executable as my mapper (streaming), but I am not sure how to write a reducer that concatenates the values from the mapper in order.
Regards,
Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehdego@miners.utep.edu
Re: Reducer to concatenate string values
Posted by Ayon Sinha <ay...@yahoo.com>.
What are you using for your map/reduce? Streaming/Java/Pig/Hive?
-Ayon