Posted to hdfs-user@hadoop.apache.org by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com> on 2014/07/21 20:20:55 UTC

planning a cluster

What is the rule for determining how many nodes should be in your initial cluster?
B.

Re: planning a cluster

Posted by Chris Mawata <ch...@gmail.com>.
If you plan to use it to learn how to program for Hadoop, then
pseudo-distributed mode (a cluster of 1) will do. If you plan to use it to
learn how to administer a cluster, then 4 or 5 nodes will allow experiments
with commissioning and decommissioning nodes, HA, journaling, etc. If it is a
proof-of-concept cluster, then it depends on the nature of the problem(s)
you want to solve.
On Jul 22, 2014 11:15 AM, "Adaryl "Bob" Wakefield, MBA" <
adaryl.wakefield@hotmail.com> wrote:

>   Someone contacted me directly and suggested the book Hadoop Operations
> by Eric Sammer.
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
>
>  *From:* YIMEN YIMGA Gael <ga...@sgcib.com>
> *Sent:* Tuesday, July 22, 2014 9:48 AM
> *To:* user@hadoop.apache.org
> *Subject:* RE: planning a cluster
>
>
> Hello,
>
>
>
> I can share a clue that I used to fix this.
>
> If you can calculate the number of nodes that you’ll need after a year,
> then you should build a cluster with that number of nodes at startup. ☺
>
>
>
> Warm regards
>
>
>
> *From:* Devaraj K [mailto:devaraj@apache.org]
> *Sent:* Tuesday 22 July 2014 16:46
> *To:* user@hadoop.apache.org
> *Subject:* Re: planning a cluster
>
>
>
> You may need to consider these things when choosing the number of nodes for
> your cluster.
>
> 1. Data storage: how much data you are going to store in the cluster
>
> 2. Data processing: what processing you are going to do in the
> cluster
>
> 3. The hardware configuration of each node
>
>
>
> On Mon, Jul 21, 2014 at 11:50 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
> What is the rule for determining how many nodes should be in your initial
> cluster?
>
> B.
>
> --
> Thanks
> Devaraj K
>
> *************************************************************************
> This message and any attachments (the "message") are confidential,
> intended solely for the addressee(s), and may contain legally privileged
> information.
> Any unauthorised use or dissemination is prohibited. E-mails are
> susceptible to alteration.
> Neither SOCIETE GENERALE nor any of its subsidiaries or affiliates shall
> be liable for the message if altered, changed or
> falsified.
> Please visit http://swapdisclosure.sgcib.com for important information
> with respect to derivative products.
>                               ************
> Ce message et toutes les pieces jointes (ci-apres le "message") sont
> confidentiels et susceptibles de contenir des informations couvertes
> par le secret professionnel.
> Ce message est etabli a l'intention exclusive de ses destinataires. Toute
> utilisation ou diffusion non autorisee est interdite.
> Tout message electronique est susceptible d'alteration.
> La SOCIETE GENERALE et ses filiales declinent toute responsabilite au
> titre de ce message s'il a ete altere, deforme ou falsifie.
> Veuillez consulter le site http://swapdisclosure.sgcib.com afin de
> recueillir d'importantes informations sur les produits derives.
> *************************************************************************
>

Re: planning a cluster

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
Someone contacted me directly and suggested the book Hadoop Operations by Eric Sammer.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba

From: YIMEN YIMGA Gael 
Sent: Tuesday, July 22, 2014 9:48 AM
To: user@hadoop.apache.org 
Subject: RE: planning a cluster

Hello,

 

I can share a clue that I used to fix this.

If you can calculate the number of nodes that you’ll need after a year, then you should build a cluster with that number of nodes at startup. ☺

 

Warm regards

 

From: Devaraj K [mailto:devaraj@apache.org] 
Sent: Tuesday 22 July 2014 16:46
To: user@hadoop.apache.org
Subject: Re: planning a cluster

 

You may need to consider these things when choosing the number of nodes for your cluster.

1. Data storage: how much data you are going to store in the cluster

2. Data processing: what processing you are going to do in the cluster

3. The hardware configuration of each node

 

On Mon, Jul 21, 2014 at 11:50 PM, Adaryl "Bob" Wakefield, MBA <ad...@hotmail.com> wrote:

What is the rule for determining how many nodes should be in your initial cluster?

B.

-- 
Thanks
Devaraj K


RE: planning a cluster

Posted by YIMEN YIMGA Gael <ga...@sgcib.com>.
Hello,

I can share a clue that I used to fix this.

If you can calculate the number of nodes that you’ll need after a year, then you should build a cluster with that number of nodes at startup. ☺
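
As a rough illustration of sizing for the end of the first year, the sketch below projects the data volume twelve months out under compound monthly growth. The function name, the 50 TB starting volume, and the 5% monthly growth rate are assumptions chosen for illustration, not figures from this thread; a steady daily ingest would call for a linear projection instead.

```python
def projected_tb(current_tb, monthly_growth_rate, months=12):
    """Project data volume after `months` of compound monthly growth.

    All inputs are estimates supplied by the planner (hypothetical values here).
    """
    if current_tb < 0 or monthly_growth_rate < 0 or months < 0:
        raise ValueError("inputs must be non-negative")
    return current_tb * (1 + monthly_growth_rate) ** months


if __name__ == "__main__":
    # Assumed example: 50 TB today, growing 5% per month for a year.
    print(round(projected_tb(50, 0.05), 1))  # prints 89.8
```

The projected volume, rather than today's volume, is then what the initial cluster would be sized to hold.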

Warm regards

From: Devaraj K [mailto:devaraj@apache.org]
Sent: Tuesday 22 July 2014 16:46
To: user@hadoop.apache.org
Subject: Re: planning a cluster

You may need to consider these things when choosing the number of nodes for your cluster.

1. Data storage: how much data you are going to store in the cluster
2. Data processing: what processing you are going to do in the cluster
3. The hardware configuration of each node

On Mon, Jul 21, 2014 at 11:50 PM, Adaryl "Bob" Wakefield, MBA <ad...@hotmail.com> wrote:
What is the rule for determining how many nodes should be in your initial cluster?
B.



--


Thanks
Devaraj K

Re: planning a cluster

Posted by Devaraj K <de...@apache.org>.
You may need to consider these things when choosing the number of nodes for
your cluster.

1. Data storage: how much data you are going to store in the cluster
2. Data processing: what processing you are going to do in the
cluster
3. The hardware configuration of each node
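
The storage and hardware factors above can be combined into a back-of-the-envelope estimate. This is a minimal sketch, assuming storage is the limiting factor and ignoring processing load entirely. The replication factor of 3 is the HDFS default; the function name, the 24 TB of usable disk per node, and the 25% headroom reserved for intermediate data are hypothetical placeholders to be replaced with your own numbers.

```python
import math


def estimate_worker_nodes(raw_tb, replication=3, usable_tb_per_node=24.0,
                          headroom=0.25):
    """Estimate how many worker nodes are needed to store `raw_tb` of data.

    replication        -- HDFS replication factor (3 is the HDFS default)
    usable_tb_per_node -- disk available to HDFS on one node (assumed value)
    headroom           -- fraction of each node's disk kept free for
                          temporary and intermediate data (assumed value)
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    if raw_tb < 0 or replication < 1 or usable_tb_per_node <= 0:
        raise ValueError("invalid sizing inputs")
    stored_tb = raw_tb * replication                    # every block is stored `replication` times
    capacity_per_node = usable_tb_per_node * (1 - headroom)
    return max(1, math.ceil(stored_tb / capacity_per_node))


if __name__ == "__main__":
    # Assumed example: 100 TB of raw data -> 300 TB stored, 18 TB usable per node.
    print(estimate_worker_nodes(100))  # prints 17
```

A compute-heavy workload may need more nodes than this storage-only figure suggests, so treat the result as a lower bound.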


On Mon, Jul 21, 2014 at 11:50 PM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   What is the rule for determining how many nodes should be in your
> initial cluster?
> B.
>



-- 


Thanks
Devaraj K
