Posted to dev@hbase.apache.org by "Nitay Joffe (JIRA)" <ji...@apache.org> on 2009/07/02 00:58:47 UTC
[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Attachment: zookeeper-r790255-hbase-1329-hbase-1551.jar
                zookeeper-edits.patch
                hbase-1551.patch

Here is a first stab at this. It could probably use some cleanup, but it should show how things will end up working.

The idea is to have bin/hbase-zookeepers.sh (which is like hbase-daemons.sh, but for ZooKeeper) call out to ZKServerTool, which reads conf/zookeepers and conf/zoo.cfg to get the list of ZooKeeper servers in the quorum. bin/hbase-zookeepers.sh then starts an HQuorumPeer on each of those servers. Each HQuorumPeer works out which server in the list it is and writes the corresponding myid file.

Note that I had to make some minor changes to ZooKeeper. I've attached the edits to their code along with a Jar that contains the edited version.

There is a case missing that I still need to work out. The HQuorumPeer finds out its myid by comparing its hostname to zoo.cfg, but it needs to read conf/zookeepers too.
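
As a rough illustration of the hostname-to-myid lookup described above (a sketch only, not the code in hbase-1551.patch): zoo.cfg lists quorum members as server.N=host:peerPort:electionPort, so the id is the N of the entry whose host matches. The class name MyIdSketch, the method findMyId, and the example hostnames are made up for this example, and it covers only the zoo.cfg case, not conf/zookeepers.

```java
import java.util.Properties;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class MyIdSketch {

    /**
     * Returns N for the "server.N=host:peerPort:electionPort" entry whose host
     * matches the given hostname, or -1 if no entry matches.
     * A malformed key such as "server.abc" would throw NumberFormatException.
     */
    static long findMyId(Properties zooCfg, String hostname) {
        for (String key : zooCfg.stringPropertyNames()) {
            if (!key.startsWith("server.")) {
                continue; // skip clientPort, tickTime, dataDir, ...
            }
            String host = zooCfg.getProperty(key).split(":")[0].trim();
            if (host.equalsIgnoreCase(hostname)) {
                return Long.parseLong(key.substring("server.".length()));
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Properties zooCfg = new Properties();
        zooCfg.setProperty("clientPort", "2181");
        zooCfg.setProperty("server.1", "zk1.example.com:2888:3888");
        zooCfg.setProperty("server.2", "zk2.example.com:2888:3888");
        System.out.println(findMyId(zooCfg, "zk2.example.com")); // prints 2
    }
}
```

The returned number is what would go into the myid file in the ZooKeeper data directory; a result of -1 corresponds to the case where the host is not listed in zoo.cfg.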

As you can probably tell, this is all a big mess. While working on it, I was thinking: what if we just got rid of zoo.cfg altogether? ZooKeeper out of the box can't read it anyway, because we are injecting things from hbase-site.xml. I think we can put all of the ZK config options in hbase-site.xml and put the list of quorum servers in conf/zookeepers. Anyone who wants to find the quorum would assemble the host:port list from conf/zookeepers and hbase-site.xml.
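
To make the "assemble the host:port list" step concrete, here is a minimal sketch. It assumes conf/zookeepers holds one hostname per line and that the client port comes from configuration; the class QuorumStringSketch, the method buildQuorum, and the handling of blank lines and "#" comments are assumptions for illustration, not something the patch defines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the file format details are assumptions.
final class QuorumStringSketch {

    /**
     * Joins hostnames (one per line, as in a conf/zookeepers-style file) with
     * a single client port into a ZooKeeper connect string of the form
     * "host1:port,host2:port". Blank lines and lines starting with '#' are
     * skipped (the '#' comment convention is assumed here).
     */
    static String buildQuorum(List<String> zookeepersLines, int clientPort) {
        List<String> parts = new ArrayList<>();
        for (String line : zookeepersLines) {
            String host = line.trim();
            if (host.isEmpty() || host.startsWith("#")) {
                continue;
            }
            parts.add(host + ":" + clientPort);
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("zk1.example.com", "zk2.example.com");
        System.out.println(buildQuorum(lines, 2181));
    }
}
```

The comma-separated host:port form is the connect string a ZooKeeper client expects, so the same helper would serve both the scripts and any client that needs to find the quorum.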

I chatted about this on IRC with Chris Wensel. He makes a good point that our shipping conf/zookeepers doesn't seem right and that we should write the ZK quorum host:port property to hbase-site.xml. I agree with him, but I want to avoid putting it in two places.

Andrew, I'm interested in your thoughts on the matter as I know you need this JIRA for your stuff.
 
Here's the relevant conversation:

{code}
[15:28]  <St^Ack> nitay: what about max connections and stuff like that?
[15:29]  <nitay> St^Ack, i'd move it all to hbase-site.xml
[15:29]  <nitay> generate a java "Properties" from the right options
[15:29]  <nitay> and feed that to ZK
[15:29]  <nitay> right now its zoo.cfg => Properties => ZK Config => ZK
[15:29]  <cwensel> nitay: i think you only need quorum servers.. the rest isn't used by the client (could be wrong)
[15:29]  <nitay> instead it'd be HBaseConfiguration => ZK Properties => ZK Config...
[15:30]  <nitay> cwensel, clientPort and some of the tick stuff i think may be used
[15:31]  <nitay> i have the tool to read conf/zookeepers or conf/zoo.cfg for server list
[15:31]  <nitay> but its just ugly
[15:31]  <nitay> having ot do that
[15:31]  <nitay> and then in hbase we're gonna have to inject the server list from conf/zookeepers into conf/zoo.cfg
[15:31]  <nitay> seems ugly to me
[15:31]  <cwensel> nitay: true, but you are parsing the servers and adding the ports.. should just use a client string
[15:31]  <cwensel> you can keep zoo.cfg, just don't use it for the client side
[15:31]  <cwensel> make it optional if hbase is managing the zk instance
[15:32]  <cwensel> allow it to be removed if there is a tandem zk cluster
[15:32]  <nitay> ye that works, it'd be much cleaner if client just write conf/zookeepers and we put it together with ports
[15:32]  <nitay> that part is trivial
[15:33]  <nitay> or directly puts in host:port string as u're saying
[15:33]  <cwensel> just force the user to put the string of servers in hbase-site.xml
[15:33]  <cwensel> but keep zoo.cfg to reduce friction for local test hbase servers
[15:33]  <nitay> well then its in two places, there and conf/zookeepers
[15:33]  <nitay> we want conf/zookeepers no matter what
[15:33]  <cwensel> why?
[15:33]  <nitay> b/c its consistent with conf/regionservers etc
[15:34]  <nitay> for users that have hbase managing zk
[15:34]  <cwensel> but those are only used by the base scripts, right?
[15:34]  <nitay> yes
[15:34]  <cwensel> not read by the config files
[15:34]  <nitay> well yes and no
[15:34]  <cwensel> i would create a 'client' property that points to the zk servers.. 
[15:34]  <nitay> im proposing they would be, bsaically im just saying i dont want this information in more than one place
[15:35]  <cwensel> i understand
[15:35]  <nitay> right now its going to be in zoo.cfg and conf/zookeepers
[15:35]  <nitay> u're saying it'll be in hbase-site.xml and conf/zookeepers
[15:35]  <cwensel> but hbase config shouldn't read zookeepers
[15:35]  <nitay> why not?
[15:35]  <cwensel> i wouldn't have zookeepers.. hbase managing a zk cluster is a convenience.. unlikely to happen in the wild in production
[15:35]  <nitay> hmm i dont necessarily agree
[15:36]  <nitay> i think there's lots of users that will have no idea what zk is and just want it to work
[15:36]  <cwensel> the only thing hadoop reads are .xml (default and site) files
[15:36]  <cwensel> don't couple bash script convenience to the hbase config objec
[15:36]  <cwensel> agreed on it just working..
[15:37]  <cwensel> but it will be a local psuedo cluster with only on zk instance
[15:37]  <nitay> i see your point, but is there a way we can get best of both worlds, no coupling, yet information in one place
[15:37]  <nitay> sure in local pseudo mode conf/zookeepers will be just "localhost"
[15:37]  <cwensel> default will be localhost:2181
[15:37]  <cwensel> in both xml and cfg
[15:37]  <cwensel> should work out of the box that way
[15:38]  <cwensel> unless you want to write scripts that launch tandem zk instances across the cluster (thinking that's out of scope) i would make all the zk stuff optional
[15:38]  <nitay> yes again the issue is when user wants to go fully dist they have to manually write conf/zookeepers (simple) and for no apparent reason also to hbase-site.xml
{code}
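
The "HBaseConfiguration => ZK Properties => ZK Config" flow mentioned in the conversation could look roughly like the sketch below. A plain Map stands in for the HBase configuration object so the example has no dependencies, and the prefix hbase.zookeeper.property. is an assumed naming convention chosen for illustration; clientPort and tickTime are ordinary zoo.cfg keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; the prefix and class name are assumptions.
final class ZkPropertiesSketch {

    // Assumed convention: ZK options live in hbase-site.xml under this prefix.
    static final String PREFIX = "hbase.zookeeper.property.";

    /**
     * Copies every entry whose key starts with PREFIX into a Properties
     * object, stripping the prefix so the keys match what zoo.cfg would use.
     */
    static Properties toZkProperties(Map<String, String> siteConfig) {
        Properties zk = new Properties();
        for (Map.Entry<String, String> e : siteConfig.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                zk.setProperty(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return zk;
    }

    public static void main(String[] args) {
        Map<String, String> site = new HashMap<>();
        site.put(PREFIX + "clientPort", "2181");
        site.put(PREFIX + "tickTime", "2000");
        site.put("hbase.rootdir", "hdfs://namenode.example.com/hbase");
        System.out.println(toZkProperties(site).getProperty("clientPort")); // prints 2181
    }
}
```

The resulting Properties object plays the role that the parsed zoo.cfg plays today, which is what would allow zoo.cfg to be dropped.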

> HBase should manage multiple node ZooKeeper quorum
> --------------------------------------------------
>
>                 Key: HBASE-1551
>                 URL: https://issues.apache.org/jira/browse/HBASE-1551
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1551.patch, zookeeper-edits.patch, zookeeper-r790255-hbase-1329-hbase-1551.jar
>
>
> I thought there was already a JIRA for this, but I cannot seem to find it.
> We need to manage multiple node ZooKeeper quorums (required for fully distributed option) in HBase to make things easier for users.
> Here's relevant IRC conversation with Ryan and Andrew:
> {code}
> Jun 17 18:14:39 <dj_ryan>	right now we include our client deps in hbase/lib
> Jun 17 18:14:47 <dj_ryan>	so removing zookeeper would be problematic
> Jun 17 18:14:56 <dj_ryan>	but hbase does put up a private zk quorum
> Jun 17 18:15:02 <dj_ryan>	it just doesnt bother with q>1
> Jun 17 18:15:05 <apurtell>	dj_ryan, nitay: agreed, so that's why i wonder about a private zk quorum managed by hbase
> Jun 17 18:15:12 <apurtell>	q ~= 5
> Jun 17 18:15:22 <dj_ryan>	so maybe we should ship tools to manage it
> Jun 17 18:15:23 <apurtell>	if possible
> Jun 17 18:15:29 <dj_ryan>	i can agree with that
> Jun 17 18:15:39 <nitay>	apurtell, ok, i'd be happy to bump the priority of hbase managing full cluster and work on that
> Jun 17 18:15:47 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has joined #hbase
> Jun 17 18:15:48 <apurtell>	nitay: that would be awesome
> Jun 17 18:15:57 <apurtell>	then i can skip discussions with cloudera about including zk also
> Jun 17 18:16:12 <apurtell>	and we can use some private ports that won't conflict with a typical zk install
> Jun 17 18:16:15 <nitay>	but i also think that users should be able to point at existing clusters, so as long as your rpms are compatible, it should be fine
> Jun 17 18:16:23 <nitay>	apurtell, isn't hadoop going to start using ZK
> Jun 17 18:16:31 <apurtell>	nitay: agree, but this is the cloudera-autoconfig-rpm (and deb) case
> Jun 17 18:16:34 <nitay>	the cloudera dude was working on using it for namenode whatnot like we do for master
> Jun 17 18:16:35 <dj_ryan>	so there are only 2 things
> Jun 17 18:16:38 <dj_ryan>	- set up myids
> Jun 17 18:16:38 <nitay>	what are they doing for that
> Jun 17 18:16:40 <dj_ryan>	- start zk
> Jun 17 18:16:42 <dj_ryan>	- stop zk
> Jun 17 18:16:50 <dj_ryan>	we dont want to start/stop zk just when we are doing a cluster bounce
> Jun 17 18:16:51 <nitay>	ye stupid myids
> Jun 17 18:16:52 <dj_ryan>	you start it once
> Jun 17 18:16:54 <dj_ryan>	and be done with ti
> Jun 17 18:16:58 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has left #hbase ("Leaving.")
> Jun 17 18:17:13 <apurtell>	dj_ryan: yes, start it once. that's what i do. works fine through many hbase restarts...
> Jun 17 18:17:28 <nitay>	so then we need a separate shell cmd or something to stop zk
> Jun 17 18:17:35 <nitay>	and start on start-hbase if not already running type thing
> Jun 17 18:17:43 <dj_ryan>	yes
> Jun 17 18:17:58 <nitay>	ok
> Jun 17 18:18:19 <apurtell>	with quorum peers started on nodes in conf/regionservers, up to ~5 if possible
> Jun 17 18:18:37 <apurtell>	but what about zoo.cfg?
> Jun 17 18:18:51 <nitay>	oh i was thinking of having separate conf/zookeepers
> Jun 17 18:18:58 <apurtell>	nitay: even better
> Jun 17 18:18:59 <nitay>	but we can use first five RS too
> Jun 17 18:19:26 <nitay>	apurtell, yeah so really there wouldnt be a conf/zookeepers, i would rip out hostnames from zoo.cfg
> Jun 17 18:19:38 <nitay>	or go the other way, generate zoo.cfg from conf/zookeepers
> Jun 17 18:19:42 <nitay>	gotta do one or the other
> Jun 17 18:19:49 <nitay>	dont want to have to edit both
> Jun 17 18:19:54 <apurtell>	nitay: right
> Jun 17 18:20:21 <apurtell>	well...
> Jun 17 18:20:29 <nitay>	zoo.cfg has the right info right now, cause u need things other than just hostnames, i.e. client and quorum ports
> Jun 17 18:20:31 <apurtell>	we can leave out servers from our default zoo.cfg
> Jun 17 18:20:39 <apurtell>	and consider a conf/zookeepers
> Jun 17 18:20:47 <dj_ryan>	i call it conf/zoos
> Jun 17 18:20:54 <dj_ryan>	in my zookeeper config
> Jun 17 18:20:54 <dj_ryan>	dir
> Jun 17 18:20:57 <nitay>	and then have our parsing of zoo.cfg insert them
> Jun 17 18:21:08 <nitay>	cause right now its all off java Properties anyways
> Jun 17 18:21:12 <apurtell>	and let the zk wrapper parse the files if they exist and otherwise build the list of quorum peers like it does already
> Jun 17 18:21:34 <apurtell>	so someone could edit either and it would dtrt
> Jun 17 18:21:48 <nitay>	apurtell, yeah, makes sense
> Jun 17 18:21:58 <nitay>	we can discuss getting rid of zoo.cfg completely
> Jun 17 18:22:12 <nitay>	put it all in XML and just create a Properties for ZK off the right props
> Jun 17 18:22:14 <apurtell>	for my purposes, i just need some files available for a post install script to lay down a static hbase cluster config based on what it discovers about the hadoop installation
> Jun 17 18:23:56 <apurtell>	then i need to hook sysvinit and use chkconfig to enable/disable services on the cluster nodes according to their roles defined by hadoop/conf/masters and hadoop/conf/regionservers
> Jun 17 18:24:13 <apurtell>	so we put the hmaster on the namenode
> Jun 17 18:24:17 <apurtell>	and the region servers on the datanodes
> Jun 17 18:24:35 <apurtell>	hadoop/conf/slaves i mean
> Jun 17 18:24:44 <apurtell>	and pick N hosts out of slaves to host the zk quorum
> Jun 17 18:24:50 <apurtell>	make sense?
> Jun 17 18:25:33 <nitay>	yes i think so, and u'll be auto generating the hbase configs for what servers run what then?
> Jun 17 18:25:50 <apurtell>	nitay: yes
> Jun 17 18:25:51 <nitay>	which is why a simple line by line conf/zookeepers type file is clean and easy
> Jun 17 18:25:57 <apurtell>	nitay: agree
> Jun 17 18:25:59 <apurtell>	so i think my initial question has been answered, hbase will manage a private zk ensemble
> Jun 17 18:26:07 <apurtell>	... somehow
> Jun 17 18:26:10 <nitay>	right :)
> Jun 17 18:26:15 <apurtell>	ok, thanks
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.