Posted to user@hadoop.apache.org by Guang Yang <gy...@millennialmedia.com> on 2012/12/15 00:03:54 UTC

question of how to take full advantage of cluster resources

Hi,

We have a beefy Hadoop cluster with 12 worker nodes, each with 32 cores. We have been running MapReduce jobs on this cluster, and we noticed that if we configure the Map/Reduce capacity of the cluster to be less than the number of available processors (32 x 12 = 384), say 216 map slots and 144 reduce slots (360 total), the jobs run okay. But if we configure the total Map/Reduce capacity to be more than 384, we observe that jobs sometimes run unusually long; the symptom is that certain tasks (usually map tasks) are stuck in the "initializing" stage for a long time on certain nodes before getting processed. The nodes exhibiting this behavior are random and not tied to specific boxes. Isn't the general rule of thumb to configure M/R capacity at twice the number of processors in the cluster? What do people usually do to maximize the usage of cluster resources in terms of capacity configuration? I'd appreciate any responses.
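
For reference, the cluster capacity above comes from per-node TaskTracker settings in mapred-site.xml. A minimal sketch, assuming the 216/144 split is spread evenly across the 12 worker nodes (18 map and 12 reduce slots per node; the values are illustrative, not the actual settings on this cluster):

  <!-- mapred-site.xml on each TaskTracker (illustrative values) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>18</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>12</value>
  </property>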

Thanks,
Guang Yang

Re: question of how to take full advantage of cluster resources

Posted by Harsh J <ha...@cloudera.com>.
Please add your RAM details as well, as that matters for the number of
concurrently spawned JVMs.
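
For context on why RAM matters here: each occupied slot runs its own child JVM, so per-node memory has to cover (slots per node) x (per-task heap). A rough illustration, assuming the 360-slot configuration above and a hypothetical 1 GB task heap (mapred.child.java.opts set to -Xmx1024m; the real heap setting on this cluster isn't stated in the thread):

  360 slots / 12 nodes    = 30 concurrent task JVMs per node
  30 JVMs x 1 GB heap     = ~30 GB of task heap per node
  plus per-JVM native overhead, the TaskTracker and DataNode daemons,
  and the OS buffer cache on top of that.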

On Sat, Dec 15, 2012 at 4:33 AM, Guang Yang <gy...@millennialmedia.com> wrote:
> Hi,
>
> We have a beefy Hadoop cluster with 12 worker nodes, each with 32 cores.
> We have been running MapReduce jobs on this cluster, and we noticed that
> if we configure the Map/Reduce capacity of the cluster to be less than
> the number of available processors (32 x 12 = 384), say 216 map slots
> and 144 reduce slots (360 total), the jobs run okay. But if we configure
> the total Map/Reduce capacity to be more than 384, we observe that jobs
> sometimes run unusually long; the symptom is that certain tasks (usually
> map tasks) are stuck in the "initializing" stage for a long time on
> certain nodes before getting processed. The nodes exhibiting this
> behavior are random and not tied to specific boxes. Isn't the general
> rule of thumb to configure M/R capacity at twice the number of
> processors in the cluster? What do people usually do to maximize the
> usage of cluster resources in terms of capacity configuration? I'd
> appreciate any responses.
>
> Thanks,
> Guang Yang



-- 
Harsh J

Re: question of how to take full advantage of cluster resources

Posted by Jeffrey Buell <jb...@vmware.com>.
The number of CPU cores is just one of several hardware constraints on the number of tasks that can run efficiently at the same time. Other constraints:

- Usually 1 to 2 map tasks per physical disk
- Leave half the memory of the machine for the buffer cache and other things, and note that the task memory might be twice the maximum heap size. I'd say 4 GB/core is the minimum; 8-12 GB/core would be better.
- With 32 cores, you need at least 10 GbE networking
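
Putting those constraints together, here is a rough back-of-the-envelope for one of these 32-core nodes, using hypothetical per-node specs of 12 data disks and 128 GB RAM (numbers chosen for illustration only; the actual hardware isn't described in the thread):

  Disks:  12 disks x 1-2 map tasks/disk            -> ~12-24 map slots
  Memory: 128 GB, half left for buffer cache/OS    -> ~64 GB for tasks;
          at ~2 GB per task JVM (heap + overhead)  -> ~32 slots
  Cores:  32 cores                                 -> ~32+ slots

  The node should be sized to the smallest of these, so disks and memory,
  not the "2x cores" rule of thumb, would be the limiting factors here.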

Jeff 

----- Original Message -----

From: "Guang Yang" <gy...@millennialmedia.com> 
To: user@hadoop.apache.org 
Cc: "Peter Sheridan" <ps...@millennialmedia.com>, "Jim Brooks" <jb...@millennialmedia.com> 
Sent: Friday, December 14, 2012 3:03:54 PM 
Subject: question of how to take full advantage of cluster resources 


Hi, 


We have a beefy Hadoop cluster with 12 worker nodes, each with 32 cores. We have been running MapReduce jobs on this cluster, and we noticed that if we configure the Map/Reduce capacity of the cluster to be less than the number of available processors (32 x 12 = 384), say 216 map slots and 144 reduce slots (360 total), the jobs run okay. But if we configure the total Map/Reduce capacity to be more than 384, we observe that jobs sometimes run unusually long; the symptom is that certain tasks (usually map tasks) are stuck in the "initializing" stage for a long time on certain nodes before getting processed. The nodes exhibiting this behavior are random and not tied to specific boxes. Isn't the general rule of thumb to configure M/R capacity at twice the number of processors in the cluster? What do people usually do to maximize the usage of cluster resources in terms of capacity configuration? I'd appreciate any responses.


Thanks, 
Guang Yang 
