Posted to mapreduce-user@hadoop.apache.org by Torsten Curdt <tc...@vafer.org> on 2010/06/22 02:14:55 UTC

limit of values in reduce phase?

I was just wondering the other day:

What if the values for a key that get passed into the reducer do
not fit into memory?
After all, a reducer should get all values per key from the whole job.
Is the iterator disk backed?

cheers
--
Torsten

Re: limit of values in reduce phase?

Posted by Torsten Curdt <tc...@vafer.org>.
Cool. Great :)

On Tue, Jun 22, 2010 at 07:47, Owen O'Malley <om...@apache.org> wrote:
>
> On Jun 21, 2010, at 5:14 PM, Torsten Curdt wrote:
>
>> I was just wondering the other day:
>>
>> What if the values for a key that get passed into the reducer do
>> not fit into memory?
>> After all, a reducer should get all values per key from the whole job.
>> Is the iterator disk backed?
>
> There is no assumption that all of the values fit into memory. The
> iterator is really the result of a merge sort from disk and/or memory.
>
> -- Owen
>

Re: limit of values in reduce phase?

Posted by Owen O'Malley <om...@apache.org>.
On Jun 21, 2010, at 5:14 PM, Torsten Curdt wrote:

> I was just wondering the other day:
>
> What if the values for a key that get passed into the reducer do
> not fit into memory?
> After all, a reducer should get all values per key from the whole job.
> Is the iterator disk backed?

There is no assumption that all of the values fit into memory. The
iterator is really the result of a merge sort from disk and/or memory.

-- Owen
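
[Editor's note] The idea Owen describes -- that the reducer's value iterator
is not an in-memory collection but the lazy output of a merge sort over
sorted runs on disk and/or in memory -- can be sketched outside Hadoop. The
following is a minimal Python model, not actual Hadoop code: the "spill
runs" are hypothetical in-memory lists standing in for sorted spill files,
and the key/value data is invented for illustration.

```python
# Sketch (not Hadoop code): models how a reduce-side value iterator can
# stream the values for each key out of several sorted "spill" runs,
# via a lazy merge sort, without ever materializing all values at once.
import heapq
from itertools import groupby
from operator import itemgetter

# Hypothetical spill runs, each already sorted by key, as the map-side
# sort/spill phase would leave them. Here they are small in-memory lists,
# but they could equally be generators reading records from files on disk.
run1 = [("a", 1), ("b", 2), ("b", 3)]
run2 = [("a", 4), ("b", 5), ("c", 6)]
run3 = [("b", 7), ("c", 8)]

# heapq.merge lazily merge-sorts the runs; groupby then yields one
# (key, iterator-of-values) pair per key -- roughly what a Hadoop
# Reducer's reduce(key, values, ...) call receives.
merged = heapq.merge(run1, run2, run3, key=itemgetter(0))

totals = {}
for key, pairs in groupby(merged, key=itemgetter(0)):
    # Consume the values one at a time; nothing forces them into a list.
    totals[key] = sum(v for _, v in pairs)

print(totals)  # {'a': 5, 'b': 17, 'c': 14}
```

Because both `heapq.merge` and `groupby` are lazy, each value is touched
exactly once as it streams past, which is why no assumption about the
values fitting in memory is needed.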