You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-user@hadoop.apache.org by Nathan Fiedler <na...@gmail.com> on 2008/05/22 19:14:52 UTC

Follow-up to sorting question from user meeting

In yesterday's user group meeting, Ted asked if the new sort function
in the MapReduce implementation was stable, since typically quicksort
is not stable for efficient implementations. Assuming Hadoop is using
one of the sort() methods in java.util.Arrays based on quicksort, then
there is a very high likelihood that the sort is not stable. Those
sort() methods explain that the implementation is based on Jon L.
Bentley and M. Douglas McIlroy's "Engineering a Sort Function",
Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November
1993). Looking at that document
(http://citeseer.ist.psu.edu/bentley93engineering.html) it's clear
their algorithm is not stable.

What this means to MapReduce is not clear to me, but I wanted to
provide the follow-up since no one in the meeting knew the answer off
the top of their head.

BTW, thanks for a really enjoyable user group meeting. I'm looking
forward to the next one.

n