You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-user@hadoop.apache.org by elton sky <el...@gmail.com> on 2011/02/28 11:30:37 UTC

why quick sort when spill map output?

Hello forumers,

Before spill the data in kvbuffer to local disk in map task, k/v are
sorted using quick sort. The complexity of quick sort is O(nlogn) and
worst case is O(n^2).
Why using quick sort?

Regards

Re: why quick sort when spill map output?

Posted by MANISH SINGLA <co...@gmail.com>.

one of the major reasons of using quicksort would be that quicksort
can easily be parallalized...due to its divide and conquer nature

On Mon, Feb 28, 2011 at 6:06 PM, James Seigel <ja...@tynt.com> wrote:
> Sorting out of the map phase is core to how hadoop works.  Are you asking why sort at all?  or why did someone use quick sort as opposed to _____sort?
>
> Cheers
> James
>
>
> On 2011-02-28, at 3:30 AM, elton sky wrote:
>
>> Hello forumers,
>>
>> Before spill the data in kvbuffer to local disk in map task, k/v are
>> sorted using quick sort. The complexity of quick sort is O(nlogn) and
>> worst case is O(n^2).
>> Why using quick sort?
>>
>> Regards
>
>

Re: why quick sort when spill map output?

Posted by James Seigel <ja...@tynt.com>.

Sorting out of the map phase is core to how hadoop works.  Are you asking why sort at all?  or why did someone use quick sort as opposed to _____sort?

Cheers
James

On 2011-02-28, at 3:30 AM, elton sky wrote:

> Hello forumers,
> 
> Before spill the data in kvbuffer to local disk in map task, k/v are
> sorted using quick sort. The complexity of quick sort is O(nlogn) and
> worst case is O(n^2).
> Why using quick sort?
> 
> Regards