Posted to mapreduce-user@hadoop.apache.org by WangRamon <ra...@hotmail.com> on 2013/07/19 08:23:43 UTC

How does AWS know how many map/reduce slot should be configured to each EC2 instance?

Hi All
We plan to move to the Amazon AWS cloud. From some research, I found that I can start a map/reduce cluster in AWS with the following command:
% bin/hadoop-ec2 launch-cluster test-cluster 2
The command lets me start a cluster with the required number of nodes (no more than 20, correct me if I am wrong), so here are my questions:
1. How does AWS know how many map/reduce slots should be configured on each EC2 instance? Does it depend on the EC2 instance type (m1.large, m1.xlarge...)?
2. How is it charged? Number of nodes * price per node per hour?
3. Is each node like a single EC2 instance in my admin console?
Thanks in advance!
Cheers
Ramon

Re: How does AWS know how many map/reduce slot should be configured to each EC2 instance?

Posted by Mischa Tuffield <mi...@mmt.me.uk>.
Hey, 

On 19 Jul 2013, at 07:55, WangRamon <ra...@hotmail.com> wrote:

> Hi Tianyi
> 
> Thanks for the reply, that's really help. So i have two further questions:
> 
> 1.  You said i can customize the number of the slots on AWS, how to do it? i know i can do it in the mapred-site.xml if i created the cluster without AWS.

You can pass arguments to a bootstrap action called "configure-hadoop" that is provided by the AWS folk, like so (I do this all the time):

 --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
  --args "-m","mapred.reduce.child.java.opts=-Xmx7168m","-s","mapred.tasktracker.reduce.tasks.maximum=80","-s","mapred.reduce.tasks=80" \
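If you set several properties, it can help to generate the --args value rather than type it by hand. Here is a minimal Python sketch. The helper name build_configure_hadoop_args is made up for illustration; the "-m" and "-s" flags and the property names are simply the ones from the command above.

```python
def build_configure_hadoop_args(settings):
    """Join (flag, key, value) triples into the quoted, comma-separated
    string used as the --args value in the command above."""
    parts = []
    for flag, key, value in settings:
        parts.append(f'"{flag}"')          # e.g. "-m" or "-s", as in the original command
        parts.append(f'"{key}={value}"')   # e.g. "mapred.reduce.tasks=80"
    return ",".join(parts)


# The same three settings as in the command above.
settings = [
    ("-m", "mapred.reduce.child.java.opts", "-Xmx7168m"),
    ("-s", "mapred.tasktracker.reduce.tasks.maximum", 80),
    ("-s", "mapred.reduce.tasks", 80),
]

print(build_configure_hadoop_args(settings))
```

The printed string can be pasted after --args; the values themselves still need to suit your instance type.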


> 2.  You mentioned about the EMR node, will the hadoop-ec2 launch-cluster command start EMR node or common EC2 instance? Thanks a lot.

An "EMR node" in this case is an EC2 instance running an AMI that the AWS folk have configured and installed a version of Hadoop on.

You can find the EMR AMI for EC2 by searching for AWS157 under AMIs.

Mischa

> 
> Cheers
> Ramon
> 
> Date: Fri, 19 Jul 2013 16:37:21 +1000
> Subject: Re: How does AWS know how many map/reduce slot should be configured to each EC2 instance?
> From: tianyi.zhu@facilitatedigital.com
> To: user@hadoop.apache.org
> 
> 1. Yes, it's depends on instance type. Generally, number of map slots + number of reduce slots = number of ECU, number of map slots / number of reduce slots >= 3. You can customize these numbers.
> 2. Yes, Number of nodes * Running hours * Price per EMR node per hour (EMR node is a little bit more expensive than EC2 node)
> 3. Yes, you can find them in admin console.
> 
> 
> On 19 July 2013 16:23, WangRamon <ra...@hotmail.com> wrote:
> Hi All
> 
> We have a plan to move to Amazon AWS cloud, by doing some research i find that i can start the map/reduce cluster in AWS with the following command:
> % bin/hadoop-ec2 launch-cluster test-cluster 2
> 
> The command allows me to start a cluster with required nodes(no more than 20, correct me if i were wrong), so here comes to my questions:
> 
> 1. How does AWS know how many map/reduce slot should be configured to each EC2 instance? Is it depends on the EC2 instance type (m1.large, m1.xlarge...)?
> 2. How it is charged? Nodes number * price per node per hour ?
> 3. Is each node like a single EC2 instance in my admin console? 
> 
> Thanks in advance!
> 
> Cheers
> Ramon

_______________________________
Mischa Tuffield PhD
http://mmt.me.uk/
@mischat






RE: How does AWS know how many map/reduce slot should be configured to each EC2 instance?

Posted by WangRamon <ra...@hotmail.com>.
Hi Tianyi
Thanks for the reply, that really helps. So I have two further questions:
1. You said I can customize the number of slots on AWS; how do I do it? I know I can do it in mapred-site.xml if I create the cluster without AWS.
2. You mentioned the EMR node. Will the hadoop-ec2 launch-cluster command start an EMR node or a common EC2 instance? Thanks a lot.
Cheers
Ramon

Date: Fri, 19 Jul 2013 16:37:21 +1000
Subject: Re: How does AWS know how many map/reduce slot should be configured to each EC2 instance?
From: tianyi.zhu@facilitatedigital.com
To: user@hadoop.apache.org

1. Yes, it depends on the instance type. Generally, number of map slots + number of reduce slots = number of ECU, and number of map slots / number of reduce slots >= 3. You can customize these numbers.
2. Yes: number of nodes * running hours * price per EMR node per hour (an EMR node is a little more expensive than an EC2 node).
3. Yes, you can find them in the admin console.


On 19 July 2013 16:23, WangRamon <ra...@hotmail.com> wrote:

Hi All
We have a plan to move to Amazon AWS cloud, by doing some research i find that i can start the map/reduce cluster in AWS with the following command:
% bin/hadoop-ec2 launch-cluster test-cluster 2

The command allows me to start a cluster with required nodes(no more than 20, correct me if i were wrong), so here comes to my questions:
1. How does AWS know how many map/reduce slot should be configured to each EC2 instance? Is it depends on the EC2 instance type (m1.large, m1.xlarge...)?
2. How it is charged? Nodes number * price per node per hour ?
3. Is each node like a single EC2 instance in my admin console?

Thanks in advance!
Cheers
Ramon

Re: How does AWS know how many map/reduce slot should be configured to each EC2 instance?

Posted by TianYi Zhu <ti...@facilitatedigital.com>.
1. Yes, it depends on the instance type. Generally, number of map slots +
number of reduce slots = number of ECU, and number of map slots / number of
reduce slots >= 3. You can customize these numbers.
2. Yes: number of nodes * running hours * price per EMR node per hour (an EMR
node is a little more expensive than an EC2 node).
3. Yes, you can find them in the admin console.
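
As a rough illustration of points 1 and 2, here is a small Python sketch of the two rules of thumb. It is only one way to satisfy them, not how AWS actually computes its defaults. The function names, the example ECU counts, and the hourly price are assumptions made up for the example.

```python
def split_slots(ecu):
    """Split an integer ECU count into (map_slots, reduce_slots) so that
    map + reduce == ecu and map / reduce >= 3."""
    if ecu < 4:
        # With fewer than 4 ECU, one reduce slot leaves fewer than 3 map slots.
        raise ValueError("need at least 4 ECU to keep a 3:1 map/reduce ratio")
    reduce_slots = ecu // 4           # at most a quarter of the slots go to reducers
    map_slots = ecu - reduce_slots    # the remainder goes to mappers
    return map_slots, reduce_slots


def cluster_cost(nodes, hours, price_per_node_hour):
    """Number of nodes * running hours * price per node per hour."""
    return nodes * hours * price_per_node_hour


# Example inputs only: 4 and 8 ECU, and a hypothetical price of 0.25 per node-hour.
print(split_slots(4))
print(split_slots(8))
print(cluster_cost(nodes=2, hours=10, price_per_node_hour=0.25))
```

Taking a quarter of the ECU count (rounded down) for reducers is just the simplest split that keeps the ratio at 3 or above; any smaller reducer share would satisfy the rule too.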


On 19 July 2013 16:23, WangRamon <ra...@hotmail.com> wrote:

> Hi All
>
> We have a plan to move to Amazon AWS cloud, by doing some research i find
> that i can start the map/reduce cluster in AWS with the following command:
> % bin/hadoop-ec2 launch-cluster test-cluster 2
>
> The command allows me to start a cluster with required nodes(no more than
> 20, correct me if i were wrong), so here comes to my questions:
>
> 1. How does AWS know how many map/reduce slot should be configured to each
> EC2 instance? Is it depends on the EC2 instance type (m1.large,
> m1.xlarge...)?
> 2. How it is charged? Nodes number * price per node per hour ?
> 3. Is each node like a single EC2 instance in my admin console?
>
> Thanks in advance!
>
> Cheers
> Ramon
>
>
