Posted to user@hbase.apache.org by Anthony Ikeda <An...@cardlink.com.au> on 2010/06/01 04:49:00 UTC
Standard deployments
I'm in the process of configuring our machines for an HBase deployment.
Based upon the documentation I've read so far, a ZooKeeper Quorum is
required with Hadoop running (of course).
However, to what degree do I need to separate the servers?
At this point I have a total of 12 servers with the possible
configuration:
4 x Hadoop (1 Master, 3 Slaves)
4 x HBase
4 x ZooKeeper
Should HBase be installed with the Hadoop instances?
i.e.:
8 x Hadoop and HBase (giving me 8 instances of Hadoop and HBase as
opposed to 4 of each)
4 x ZooKeeper
Or is it typical practice for HBase to be installed on an environment
separate to Hadoop?
Anthony Ikeda
Java Analyst/Programmer
Cardlink Services Limited
Level 4, 3 Rider Boulevard
Rhodes NSW 2138
Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283
**********************************************************************
This e-mail message and any attachments are intended only for the use of the addressee(s) named above and may contain information that is privileged and confidential. If you are not the intended recipient, any display, dissemination, distribution, or copying is strictly prohibited. If you believe you have received this e-mail message in error, please immediately notify the sender by replying to this e-mail message or by telephone to (02) 9646 9222. Please delete the email and any attachments and do not retain the email or any attachments in any form.
**********************************************************************
RE: Standard deployments
Posted by Anthony Ikeda <An...@cardlink.com.au>.
Wow, a great response from everyone. I'm trying to document my
understanding of the configuration and installation just to get my head
around the setup.
For now I'll start with 4 servers (1 x Hadoop Master, 1 x HBase Master,
2 x Hadoop/HBase slaves) and try and get the concept right first (the
deployment document I've started has reached nearly 4 pages!)
I need to get my head around ensuring that a server is strictly a
Datanode, master, slave, namenode, tasktracker etc. Hopefully I'll have
something ready by tomorrow!
-----Original Message-----
From: Patrick Hunt [mailto:phunt@apache.org]
Sent: Tuesday, 1 June 2010 4:09 PM
To: user@hbase.apache.org
Cc: Anthony Ikeda
Subject: Re: Standard deployments
Hi Anthony, cut back to 3 ZooKeeper servers in the ZK ensemble. The
quorum uses "majority rule", so 4 servers is actually worse than 3
(typically you would go to 5 servers as the next step up from 3).
Patrick
Re: Standard deployments
Posted by Patrick Hunt <ph...@apache.org>.
Hi Anthony, cut back to 3 ZooKeeper servers in the ZK ensemble. The
quorum uses "majority rule", so 4 servers is actually worse than 3
(typically you would go to 5 servers as the next step up from 3).
Patrick
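The majority rule can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up here and are not part of ZooKeeper, and it assumes a quorum is a strict majority of the ensemble.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption a 4-server ensemble needs 3 servers for a majority, so it survives only one failure, the same as 3 servers, while adding one more machine that can fail. Going to 5 servers is the first step that tolerates two failures.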
Re: Standard deployments
Posted by Todd Lipcon <to...@cloudera.com>.
Hi Anthony,
For clusters of this size, I would recommend:
10 nodes with DN, TT, and RegionServer
1 node with HMaster, JT, NN, ZK
1 node with HMaster, 2NN, NFS export for second copy of NN metadata
The various master nodes are all very low resource consumption on small
clusters, so you can safely colocate them with ZooKeeper. Running a
multinode ZK quorum doesn't really buy you anything on a small cluster -- it
doesn't need a lot of resources, and having high availability doesn't help
you much when the rest of the services on that machine aren't HA.
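As a rough sketch, the 12-node layout described above can be written down as a map from host to roles. The hostnames (master1, master2, worker01, ...) and the helper names are hypothetical and chosen only for illustration; the role abbreviations are the ones used in the list above.

```python
from collections import Counter

# Role abbreviations as used in the layout above.
WORKER_ROLES = ("DN", "TT", "RegionServer")
MASTER_1_ROLES = ("HMaster", "JT", "NN", "ZK")
MASTER_2_ROLES = ("HMaster", "2NN", "NFS export")


def build_layout(worker_count: int = 10) -> dict[str, tuple[str, ...]]:
    """Map hypothetical hostnames to the roles each node would run."""
    layout = {
        "master1": MASTER_1_ROLES,
        "master2": MASTER_2_ROLES,
    }
    for i in range(1, worker_count + 1):
        layout[f"worker{i:02d}"] = WORKER_ROLES
    return layout


def role_counts(layout: dict[str, tuple[str, ...]]) -> Counter:
    """Count how many nodes run each role."""
    return Counter(role for roles in layout.values() for role in roles)


if __name__ == "__main__":
    layout = build_layout()
    print(f"{len(layout)} nodes in total")
    for role, count in sorted(role_counts(layout).items()):
        print(f"{role}: {count}")
```

Counting roles this way makes the shape of the proposal easy to see: ten nodes carry the storage and compute roles, and the single-instance master services, including the one ZooKeeper server, share the remaining two machines.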
While at first it sounds like a bad idea to put all your eggs in one basket,
ops teams are very good at protecting egg-filled baskets, and it's easier to
protect one than protect several. With quality hardware and dual power
supplies on distinct PDUs, MTBF should be > 2 years.
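One way to put a number on the one-basket argument is sketched below. It rests on a simplifying assumption that is not in the message: machines fail independently with exponentially distributed lifetimes, so the expected time until the first of n machines fails is the single-machine MTBF divided by n. The function name is made up for illustration, and the 2-year figure is the lower bound quoted above.

```python
def mean_time_to_first_failure(mtbf_years: float, machines: int) -> float:
    """Expected time until the first of `machines` independent machines fails.

    Assumes exponentially distributed lifetimes, for which the minimum of
    n independent lifetimes with mean M has mean M / n.
    """
    if machines < 1:
        raise ValueError("machines must be at least 1")
    return mtbf_years / machines


if __name__ == "__main__":
    for n in (1, 2, 4):
        years = mean_time_to_first_failure(2.0, n)
        print(f"{n} critical machine(s): first failure expected after {years:.2f} years")
```

Under that model, spreading non-HA master services over more machines that must all stay up shortens the expected time to the first outage instead of lengthening it.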
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
Re: Standard deployments
Posted by Ryan Rawson <ry...@gmail.com>.
It is typical to install HBase overlapping a Hadoop/HDFS installation. So
you would do best with:
- master node runs:
-- namenode, hbase master, zookeeper, jobtracker (map reduce master)
- slave nodes run:
-- datanode, regionserver, tasktracker
It is better to do this with 1 master and 7 slaves than to segregate the
hosts. You end up sharing resources better and more evenly.
-ryan
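A small sketch of what the overlapping layout means for the host lists. It assumes the Hadoop and HBase releases in use read their worker hosts from conf/slaves and conf/regionservers, one hostname per line; check this against the versions you install. The hostnames and the helper name are hypothetical.

```python
def host_list_file(hosts: list[str]) -> str:
    """Render hostnames one per line, as a plain host-list file."""
    return "\n".join(hosts) + "\n"


# Hypothetical hostnames for a cluster with 1 master and 7 slaves.
MASTER = "master1"
SLAVES = [f"slave{i}" for i in range(1, 8)]

# Every slave runs datanode, tasktracker and regionserver, so under the
# assumption above both files carry the same list of hosts.
hadoop_slaves_file = host_list_file(SLAVES)        # assumed: Hadoop conf/slaves
hbase_regionservers_file = host_list_file(SLAVES)  # assumed: HBase conf/regionservers

if __name__ == "__main__":
    print(f"master: {MASTER}")
    print(hadoop_slaves_file, end="")
```

With segregated hosts the two lists would differ and each service would have fewer machines to work with. With the overlapping layout they are identical, which is what lets HDFS, MapReduce and HBase share all seven slaves.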