Posted to hdfs-user@hadoop.apache.org by Björn-Elmar Macek <ma...@cs.uni-kassel.de> on 2012/05/10 11:36:05 UTC
Is the order of collected outputs in the map step preserved until the
reduce step?
Hello all,
I am currently working with a data set that is chronologically
ordered (every element has a timestamp, and the timestamps are
monotonically increasing). Please correct me if I am mistaken, but the
data should "arrive" at the mapper in chronological order, right? And is
the order in which I push values to the output preserved, so that the
Iterator passed as a parameter to the reduce function also yields the
values in chronological order?
Thank you in advance for your help!
Best regards,
Björn
Re: Is the order of collected outputs in the map step preserved until
the reduce step?
Posted by Harsh J <ha...@cloudera.com>.
The framework makes no such guarantee. The only guarantee is at the
key-sort level: each call to reduce() receives exactly one key together
with all of its associated values (in no particular order), and the keys
are passed to reduce() in sorted order (within each reduce task).
However, you can meet this kind of requirement with a secondary sort
technique:
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
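To illustrate the idea behind a secondary sort, here is a minimal sketch in plain Java. It does not use any Hadoop classes; it only simulates what the shuffle would do if the map output key were a composite of the natural key and the timestamp. The names CompositeKey, SORT_ORDER and shuffle are hypothetical and chosen for this example only. In a real job, the same roles would presumably be filled by a custom partitioner on the natural key, a sort comparator on the full composite key, and a grouping comparator on the natural key alone (for example via Job.setPartitionerClass, Job.setSortComparatorClass and Job.setGroupingComparatorClass in the newer API).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the secondary sort idea; no Hadoop classes are used.
class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the timestamp to order by.
    static final class CompositeKey {
        final String naturalKey;
        final long timestamp;

        CompositeKey(String naturalKey, long timestamp) {
            this.naturalKey = naturalKey;
            this.timestamp = timestamp;
        }
    }

    // Stand-in for the sort comparator: natural key first, then timestamp.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingLong(k -> k.timestamp);

    // Stand-in for the shuffle: sort by the full composite key, then group
    // by the natural key only, so each group sees its values in time order.
    static Map<String, List<Long>> shuffle(List<CompositeKey> mapOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT_ORDER);
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            groups.computeIfAbsent(k.naturalKey, x -> new ArrayList<>())
                  .add(k.timestamp);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Map output arriving in arbitrary order, as after a merge of several map tasks.
        List<CompositeKey> mapOutput = Arrays.asList(
                new CompositeKey("sensorB", 30L),
                new CompositeKey("sensorA", 20L),
                new CompositeKey("sensorB", 10L),
                new CompositeKey("sensorA", 5L));
        // prints {sensorA=[5, 20], sensorB=[10, 30]}
        System.out.println(shuffle(mapOutput));
    }
}
```

The point of the sketch is that the ordering of values is not left to the framework's handling of values at all: the timestamp is moved into the key, so the key sort that the framework does guarantee also orders the values within each group.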
--
Harsh J