Posted to hdfs-user@hadoop.apache.org by ch huang <ju...@gmail.com> on 2013/08/08 11:09:20 UTC

issue about hadoop hardware choose

Hi all,

My company needs to build a 10-node Hadoop cluster (2 NameNodes and
8 DataNode + NodeManager machines, for both data storage and data analysis).
We run HBase and Hive on the cluster, and the data grows by about 10 GB per day.
We use CDH 4.3 (for dual-NameNode HA). My plan is:

    NameNode & ResourceManager
        dual quad-core CPU
        24 GB RAM
        2 x 500 GB SATA disks (JBOD)

    DataNode & NodeManager
        dual quad-core CPU
        24 GB RAM
        2 x 1 TB SATA disks (JBOD)


My questions are:
1. Does the ResourceManager (RM) need a dedicated server? (I plan to put the
   RM on one of the NNs.)
2. Is the RAM enough for the RM + NN machine?
3. Is RAID needed for the NN machines?
4. Is it OK to place the JournalNodes (JN) on other nodes (DN or NN)?
5. How many ZooKeeper server nodes do I need?
6. I want to place the YARN proxy server and the MapReduce history server on
   the other NN. Is that OK?
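
As a rough way to reason about the disk plan above, the sketch below estimates how long the DataNode disks would last at the stated growth rate of about 10 GB per day. The replication factor of 3 is the HDFS default. The 25% reserve for non-HDFS use (logs, shuffle and temp space) and the function name are assumptions for illustration, not figures from this thread, and HBase compaction overhead is ignored.

```python
def days_until_full(datanodes, disks_per_node, disk_tb, daily_ingest_gb,
                    replication=3, usable_fraction=0.75):
    """Estimate the days until HDFS fills up (decimal units, 1 TB = 1000 GB).

    replication=3 is the HDFS default; usable_fraction=0.75 is an assumed
    reserve for non-HDFS data and is not taken from the thread.
    """
    raw_gb = datanodes * disks_per_node * disk_tb * 1000
    usable_gb = raw_gb * usable_fraction
    # Every ingested GB is stored `replication` times across the cluster.
    raw_growth_per_day_gb = daily_ingest_gb * replication
    return usable_gb / raw_growth_per_day_gb


if __name__ == "__main__":
    # Planned layout: 8 DataNodes, 2 x 1 TB each, 10 GB of new data per day.
    print(days_until_full(8, 2, 1, 10))
```

Under these assumptions the planned layout lasts a bit over a year, so the per-node disk count matters for both capacity and I/O bandwidth.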

Re: issue about hadoop hardware choose

Posted by Azuryy Yu <az...@gmail.com>.
If you want HA, do you plan to deploy the JournalNodes on the DataNodes?
On Aug 8, 2013 5:09 PM, "ch huang" <ju...@gmail.com> wrote:

> hi,all:
>             My company need build a 10 node hadoop cluster (2 namenode and
> 8 datanode & node manager ,for both data storage and data analysis ) ,we
> have hbase ,hive on the hadoop cluster, 10G data increment per day.
>             we use CDH4.3 ( for dual - namenode HA),my plan is
>
>            name node  & resource manager
>            dual Quad Core
>          24G RAM
>          2 * 500GB SATA DISK (JBOD)
>
>          datanode & node manager
>          dual Quad Core
>          24G RAM
>          2 * 1TGB SATA DISK (JBOD)
>
>
> my question is
> 1, if resource manager need a dedicated server? ( i plan to put RM with
> one of NN)
> 2, if the RAM is enough for RM + NN machine?
> 3,RAID is need for NN machine?
> 4,is it ok if i place JN on other node(DN or NN)
> 5, how much zookeeper server node i need?
> 6,i want to place yarn proxy server and mapreduce history server with
> another NN,is it ok?
>
>
>
>
>

Re: issue about hadoop hardware choose

Posted by Mirko Kämpf <mi...@gmail.com>.
Hello Ch Huang,


Do you know this book?
"Hadoop Operations" http://shop.oreilly.com/product/0636920025085.do

I think it answers most of your questions in detail.

For a production cluster you should consider MRv1.
I also suggest using more hard drives per slave node to get higher
I/O bandwidth for MapReduce: give each node at least 4 x 2 TB, or even 6.
Use at least three ZooKeeper servers.

Best wishes
Mirko



2013/8/8 ch huang <ju...@gmail.com>

> hi,all:
>             My company need build a 10 node hadoop cluster (2 namenode and
> 8 datanode & node manager ,for both data storage and data analysis ) ,we
> have hbase ,hive on the hadoop cluster, 10G data increment per day.
>             we use CDH4.3 ( for dual - namenode HA),my plan is
>
>            name node  & resource manager
>            dual Quad Core
>          24G RAM
>          2 * 500GB SATA DISK (JBOD)
>
>          datanode & node manager
>          dual Quad Core
>          24G RAM
>          2 * 1TGB SATA DISK (JBOD)
>
>
> my question is
> 1, if resource manager need a dedicated server? ( i plan to put RM with
> one of NN)
> 2, if the RAM is enough for RM + NN machine?
> 3,RAID is need for NN machine?
> 4,is it ok if i place JN on other node(DN or NN)
> 5, how much zookeeper server node i need?
> 6,i want to place yarn proxy server and mapreduce history server with
> another NN,is it ok?
>
>
>
>
>
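
The "at least three" advice follows from majority quorum: a ZooKeeper ensemble stays available only while more than half of its servers are up, and the JournalNodes used for NameNode HA follow the same majority rule. A minimal sketch of that arithmetic (the function names are illustrative only):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

An ensemble of three tolerates one failure, and adding a fourth server does not raise that number, which is why odd ensemble sizes are the usual choice.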
