Posted to common-user@hadoop.apache.org by jag123 <ja...@yahoo.com> on 2007/12/26 23:52:38 UTC

Performance issues with large map/reduce processes

Hi,

I am running a map/reduce task on a large cluster (70+ machines). I use a
single input file, and a sufficient number of map/reduce tasks so that each
map process gets 250k records. That is, if my input file contains 1
million records, I use 4 map and 4 reduce processes so that each map process
gets 250k records. Each map/reduce usually takes 30 seconds to complete.

A strange thing happens when I scale this problem:

1 million records, 4 maps + 4 reduces ==> 30 seconds per map process
5 million records, 20 maps + 20 reduces ==> 1 minute per map process
50 million records, 200 maps + 200 reduces ==> 3 minutes per map process
500 million records, 2000 maps + 2000 reduces ==> 45 minutes (!) per map process

Note that in all the above cases, the map process performs the same amount
of work (250k records).

In all the cases, I use a single large input file. Hadoop breaks the file
into ~16 MB chunks (about 250k records). Input format is
TextInputFormat.class. I cannot think of any reason why this is happening.
The task setup in all the above cases takes 30 seconds or so. But then the
map process practically crawls. 
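
For reference, here is a minimal sketch of the kind of driver I am describing
(class names and paths are made up, and the exact path-setting calls vary a
bit between Hadoop versions; this uses the org.apache.hadoop.mapred API):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.TextInputFormat;

  public class RecordJob {
    public static void main(String[] args) throws Exception {
      JobConf conf = new JobConf(RecordJob.class);
      conf.setJobName("explode-records");

      // One large text input file; Hadoop splits it into ~16 MB chunks
      // (about 250k records each).
      conf.setInputFormat(TextInputFormat.class);
      FileInputFormat.setInputPaths(conf, new Path("/data/records.txt"));
      FileOutputFormat.setOutputPath(conf, new Path("/data/out"));

      // Scaled so each map sees ~250k records; setNumMapTasks is only a
      // hint, the actual split count comes from the input format.
      conf.setNumMapTasks(2000);
      conf.setNumReduceTasks(2000);

      // The real map and reduce classes are omitted from this sketch.
      conf.setOutputKeyClass(Text.class);
      conf.setOutputValueClass(Text.class);

      JobClient.runJob(conf);
    }
  }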


Re: Performance issues with single/multiple files

Posted by Owen O'Malley <oo...@yahoo-inc.com>.
On Dec 26, 2007, at 2:50 PM, jag123 wrote:

> A strange thing happens when I scale this problem:
>
> 1 million records, 4 map + 4 reduce ==> 30 seconds per map process
> 5 million records, 20 map + 20 reduce ==>  1 minute per map process
> 50 million records, 200 map + 200 reduce ==>  3 minute per map process
> 500 million records, 2000 map + 2000 reduces ==> 45 minutes! per  
> map process

Thanks for sharing your problem. With a 70 node cluster, I'd strongly  
suggest that you cut back to 140 reduces. You should save a lot of  
time in the shuffle.
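
In the driver that is a one-line change (assuming a JobConf-based job setup):

  // Two reduces per node on a 70-node cluster: 70 * 2 = 140. Far fewer
  // reduces means far fewer map-output segments to fetch and merge in
  // the shuffle.
  conf.setNumReduceTasks(140);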

> The task setup in all the above cases takes 30 seconds or so. But  
> then the
> map process practically crawls.


Look at the web ui and find the execution times on the maps without  
the shuffle. Are they similar between the 50m and 500m runs? If not,  
it would help to get in there with a profiler. Either use one of your  
own or try out my HADOOP-2367 patch. *smile*
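
If you roll your own, one simple route (separate from the patch) is to hand
an hprof agent to the task JVMs through mapred.child.java.opts, e.g.:

  // Sketch: CPU-sample the child task JVMs with the JVM's built-in hprof
  // agent; the java.hprof.txt output lands in each task's working
  // directory on the tasktracker node.
  conf.set("mapred.child.java.opts",
           "-Xmx256m -agentlib:hprof=cpu=samples,depth=8,interval=10,force=n");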

-- Owen

Re: Performance issues with large map/reduce processes

Posted by Ted Dunning <td...@veoh.com>.
That is a very small heap.  The reduces, in particular, would benefit
substantially from having more memory.

Other than that (and having fewer reduces), I am at a bit of a loss.  I know
that others are working on comparably sized problems without much
difficulty.  

There might be an interaction between your small heaps and the attempts to
buffer map outputs.
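
As a rough sketch of what I mean (the numbers are guesses, and your
administrator may have marked these properties final):

  // Give each task JVM more headroom than 256 MB...
  conf.set("mapred.child.java.opts", "-Xmx512m");
  // ...and/or shrink the buffer used to sort map output in memory so it
  // fits comfortably inside a small heap (io.sort.mb defaults to 100 MB).
  conf.setInt("io.sort.mb", 64);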


On 12/27/07 10:31 AM, "jag123" <ja...@yahoo.com> wrote:

>> What happens on the large size problems if you decrease the number of
>> maps,
>> but keep input size constant?
> 
> My map processes run out of Heap space if I decrease the number of map
> processes. 
> The maximum number of records that my map process can handle is 250k as my
> heap size is only 256m.


Re: Performance issues with large map/reduce processes

Posted by jag123 <ja...@yahoo.com>.
> Can you say a bit more about your processes?  Are they truly parallel maps
> without any shared state?

My map process reads a record (one line) from the input and explodes it into
a bunch of key-value pairs. The number of key-value pairs is typically small
(around 8 on average) and is independent of the size of the dataset; it
depends only on the input record.

I use a bunch of Reporter objects to monitor the performance of the
map/reduce processes, but I guess that does not affect performance. I do not
access the disk, etc.; the map operation is purely CPU-bound.
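
Roughly, the map looks like the following (the field names and splitting
logic are placeholders, but the shape is accurate):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class ExplodeMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {

    // Counters surfaced through the Reporter object.
    public enum Counters { RECORDS, PAIRS }

    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
      // Explode one input record into a handful (~8) of key-value pairs.
      // No disk access and no shared state: purely CPU-bound work.
      String[] fields = line.toString().split("\t");
      for (int i = 1; i < fields.length; i++) {
        output.collect(new Text(fields[i]), new Text(fields[0]));
        reporter.incrCounter(Counters.PAIRS, 1);
      }
      reporter.incrCounter(Counters.RECORDS, 1);
    }
  }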

> Are you getting a good limit on maximum number of maps and reduces per
> machine?

The number of maps/reduces per machine is set to two by the administrator.
I am getting around 140 maps from the 70-machine cluster.

> How are you measuring these times?  Do they include shuffle time as well
> as
> map time?  Do they include time before running?

Yes, the time includes both the shuffle time and the map time. The shuffle
time is less than 30 seconds even for the largest tasks, as I can see that
the map has started to process the records (although very, very slowly).
I can monitor the progress of each map process using the "reporter" objects.

> What happens on the large size problems if you decrease the number of
> maps,
> but keep input size constant?

My map processes run out of heap space if I decrease the number of map
processes. The maximum number of records each map process can handle is
250k, as my heap size is only 256 MB.

> Finally, why do you have so many reduces?  Usually it is good to have at
> most a small multiple of the number of machines in your cluster.

Yes, I agree, and thanks for pointing that out. In the large setup, I rarely
get past the map stage because it takes tens of hours for all the maps to
finish.



> On 12/26/07 2:52 PM, "jag123" wrote:
> 
>> 
>> Hi,
>> 
>> I am running a map/reduce task on a large cluster (70+ machines). I use a
>> single input file, and sufficient number of map/reduce tasks so that each
>> map process gets 250k records. That is, if my  input file contains 1
>> million records, I use 4 map and 4 reduce processes so that each map
>> process
>> gets 250k records.  The maps/reduce usually takes 30 seconds to complete.
>> 
>> A strange thing happens when I scale this problem:
>> 
>> 1 million records, 4 map + 4 reduce ==> 30 seconds per map process
>> 5 million records, 20 map + 20 reduce ==>  1 minute per map process
>> 50 million records, 200 map + 200 reduce ==>  3 minute per map process
>> 500 million records, 2000 map + 2000 reduces ==> 45 minutes! per map
>> process
>> 
>> Note that in all the above cases, the map process performs the same
>> amount
>> of work (250k records).
>> 
>> In all the cases, I use a single large input file. Hadoop breaks the file
>> into ~16 MB chunks (about 250k records). Input format is
>> TextInputFormat.class. I cannot think of any reason why this is
>> happening.
>> The task setup in all the above cases takes 30 seconds or so. But then
>> the
>> map process practically crawls. 
> 
> 
> 



Re: Performance issues with large map/reduce processes

Posted by Ted Dunning <td...@veoh.com>.
Can you say a bit more about your processes?  Are they truly parallel maps
without any shared state?

Are you getting a good limit on maximum number of maps and reduces per
machine?

How are you measuring these times?  Do they include shuffle time as well as
map time?  Do they include time before running?

What happens on the large size problems if you decrease the number of maps,
but keep input size constant?

Finally, why do you have so many reduces?  Usually it is good to have at
most a small multiple of the number of machines in your cluster.


On 12/26/07 2:52 PM, "jag123" <ja...@yahoo.com> wrote:

> 
> Hi,
> 
> I am running a map/reduce task on a large cluster (70+ machines). I use a
> single input file, and sufficient number of map/reduce tasks so that each
> map process gets 250k records. That is, if my  input file contains 1
> million records, I use 4 map and 4 reduce processes so that each map process
> gets 250k records.  The maps/reduce usually takes 30 seconds to complete.
> 
> A strange thing happens when I scale this problem:
> 
> 1 million records, 4 map + 4 reduce ==> 30 seconds per map process
> 5 million records, 20 map + 20 reduce ==>  1 minute per map process
> 50 million records, 200 map + 200 reduce ==>  3 minute per map process
> 500 million records, 2000 map + 2000 reduces ==> 45 minutes! per map process
> 
> Note that in all the above cases, the map process performs the same amount
> of work (250k records).
> 
> In all the cases, I use a single large input file. Hadoop breaks the file
> into ~16 MB chunks (about 250k records). Input format is
> TextInputFormat.class. I cannot think of any reason why this is happening.
> The task setup in all the above cases takes 30 seconds or so. But then the
> map process practically crawls.