Posted to mapreduce-user@hadoop.apache.org by Arun C Murthy <ac...@yahoo-inc.com> on 2011/02/01 06:32:48 UTC

Re: Map output in disk and memory at same time?

On Jan 31, 2011, at 10:51 AM, Pedro Costa wrote:

> Hi,
>
> When the reduce fetches a 1 GB map output from the mappers and does
> the merge, is it possible that part of the map output is saved on
> disk and the other part in memory?
>

Yes, the reduce tries to keep as much of the fetched map output in
memory as possible.

If it's under memory pressure it merges and writes to disk.

If there is excess memory it merges into memory.

So there are three kinds of merges going on, depending on available
memory:
memory-to-memory
memory-to-disk
disk-to-disk

The final merge (from memory and/or disk) feeds the 'reduce' function.
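
To make the behaviour above concrete, here is a toy Python model of how
fetched map outputs could end up split between memory and disk. It is only
a sketch, not Hadoop's actual shuffle code: the class name ShuffleSketch,
its fields, and the two fractions (0.25 for the largest segment allowed in
memory, 0.66 for the usage level that triggers a memory-to-disk merge) are
assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ShuffleSketch:
    """Toy model of reduce-side placement of map outputs (hypothetical)."""

    buffer_bytes: int                   # memory set aside for fetched map outputs
    max_single_fraction: float = 0.25   # assumed cap for one in-memory segment
    merge_fraction: float = 0.66        # assumed usage level that forces a spill
    in_memory: list = field(default_factory=list)
    on_disk: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def fetch(self, size: int) -> None:
        # A segment too large for the buffer goes straight to disk.
        if size > self.buffer_bytes * self.max_single_fraction:
            self.on_disk.append(size)
            self.log.append("fetch-to-disk")
            return

        self.in_memory.append(size)
        self.log.append("fetch-to-memory")

        # Under memory pressure, merge the in-memory segments into one disk file.
        if sum(self.in_memory) > self.buffer_bytes * self.merge_fraction:
            self.on_disk.append(sum(self.in_memory))
            self.in_memory.clear()
            self.log.append("memory-to-disk merge")

    def final_merge_inputs(self) -> dict:
        # The final merge reads whatever is left in memory and/or on disk.
        return {"memory": list(self.in_memory), "disk": list(self.on_disk)}


if __name__ == "__main__":
    shuffle = ShuffleSketch(buffer_bytes=1000)
    for segment_size in [300, 200, 200, 200, 100, 50]:
        shuffle.fetch(segment_size)
    print(shuffle.log)
    print(shuffle.final_merge_inputs())
```

In this run the final merge would read from both places at once, which is
the situation the question asks about. The model leaves out
memory-to-memory and disk-to-disk merges for brevity.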

Arun