Posted to hdfs-user@hadoop.apache.org by Björn-Elmar Macek <ma...@cs.uni-kassel.de> on 2012/05/10 11:36:05 UTC

Is the order of collected outputs in the map step preserved until the reduce step?

Hello all,

I am currently working with a data set that is chronologically ordered
(every data element has a timestamp, and the timestamps are monotonically
increasing). Please correct me if I am mistaken, but the data should
"arrive" at the mapper in chronological order, right? And is the order
in which I push values to the output preserved, so that the Iterator
passed as a parameter to the reduce function also yields the values in
chronological order?

Thank you in advance for your help!

Best regards,
Björn

Re: Is the order of collected outputs in the map step preserved until the reduce step?

Posted by Harsh J <ha...@cloudera.com>.
The framework makes no such guarantee. The only guarantee is at the
key-sort level: each call to reduce() carries exactly one key and all of
its associated values (in no particular order), and within each reduce
task the keys are passed to reduce() in sorted order.

However, you can meet this kind of requirement with the secondary sort
technique:
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
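
To illustrate the idea behind secondary sort, here is a minimal sketch in
plain Java with no Hadoop dependency. It only simulates the shuffle: the
timestamp is made part of the sort key, so records are sorted by
(natural key, timestamp) and then grouped by the natural key alone. The
class, field, and method names (SecondarySortSketch, MapOutput, shuffle)
are made up for this example and are not Hadoop APIs. In an actual job the
same roles are typically played by a composite key class, a partitioner on
the natural key, and a grouping comparator (for instance registered through
Job.setPartitionerClass, Job.setSortComparatorClass and
Job.setGroupingComparatorClass in the newer API); the linked articles walk
through a real setup.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free simulation of the secondary sort idea.
final class SecondarySortSketch {

    // One map output record. (naturalKey, timestamp) together act as the
    // composite key; value is the payload the reducer wants to see in order.
    static final class MapOutput {
        final String naturalKey;
        final long timestamp;
        final String value;

        MapOutput(String naturalKey, long timestamp, String value) {
            this.naturalKey = naturalKey;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Sort order of the composite key: natural key first, then timestamp.
    static final Comparator<MapOutput> SORT_ORDER =
        Comparator.comparing((MapOutput r) -> r.naturalKey)
                  .thenComparingLong(r -> r.timestamp);

    // Simulated shuffle: sort by the composite key, then group by the
    // natural key only. Each group's values end up in timestamp order.
    static Map<String, List<String>> shuffle(List<MapOutput> mapOutputs) {
        List<MapOutput> sorted = new ArrayList<>(mapOutputs);
        sorted.sort(SORT_ORDER);

        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (MapOutput r : sorted) {
            groups.computeIfAbsent(r.naturalKey, k -> new ArrayList<>()).add(r.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Records deliberately arrive out of chronological order.
        List<MapOutput> outputs = List.of(
            new MapOutput("sensorB", 30L, "b@30"),
            new MapOutput("sensorA", 20L, "a@20"),
            new MapOutput("sensorB", 10L, "b@10"),
            new MapOutput("sensorA", 5L, "a@5"));

        // Prints:
        // sensorA -> [a@5, a@20]
        // sensorB -> [b@10, b@30]
        shuffle(outputs).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

The point of the sketch is that the ordering comes from the sort on the
composite key, not from the order in which the mapper emitted the values.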


-- 
Harsh J