Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/07/02 01:24:47 UTC

[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726267#action_12726267 ] 

Andrew Purtell commented on HBASE-1551:
---------------------------------------

Based on my understanding, the Cloudera ZK install will put a working zoo.cfg in /etc/zookeeper/conf/zoo.cfg. For the first pass, this will bring up a single-node ensemble only. HBase will need to append the ensemble details to zoo.cfg; this can be a post-install step executed by the admin. We will also need to create the myid files. They have not settled on the precise location of the data files yet. 
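
As a rough illustration, a post-install step along these lines could append the ensemble details and lay down the myid files. The hostnames, server ids, and data directory below are placeholders (the data file location is still undecided, per above); only the 2888/3888 quorum/election ports are the conventional ZooKeeper defaults.

{code}
#!/bin/sh
# Hypothetical post-install sketch: extend the single-node zoo.cfg that the
# package lays down into a multi-node ensemble. Hostnames and the data
# directory are placeholder assumptions, not settled values.
ZOOCFG=/etc/zookeeper/conf/zoo.cfg
DATADIR=/var/zookeeper            # placeholder; final location not decided yet

ID=1
for host in zk1.example.com zk2.example.com zk3.example.com; do
  # 2888/3888 are the conventional ZooKeeper quorum and leader-election ports.
  echo "server.${ID}=${host}:2888:3888" >> ${ZOOCFG}
  # Each peer also needs a myid file containing its own server id.
  if [ "$(hostname -f)" = "${host}" ]; then
    mkdir -p ${DATADIR}
    echo ${ID} > ${DATADIR}/myid
  fi
  ID=$((ID + 1))
done
{code}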

On a Cloudera distro, the ZK service will be a set of external daemons from the HBase perspective, managed separately. 
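
In that case HBase only needs to be told where the external ensemble lives rather than starting it itself. A minimal sketch, assuming an HBASE_MANAGES_ZK switch in hbase-env.sh and the hbase.zookeeper.quorum property (names used here for illustration, not settled by this issue):

{code}
# conf/hbase-env.sh -- tell the HBase start/stop scripts not to manage ZK daemons
export HBASE_MANAGES_ZK=false

<!-- conf/hbase-site.xml, inside <configuration>: where the external ensemble lives -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
{code}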

> HBase should manage multiple node ZooKeeper quorum
> --------------------------------------------------
>
>                 Key: HBASE-1551
>                 URL: https://issues.apache.org/jira/browse/HBASE-1551
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1551.patch, zookeeper-edits.patch, zookeeper-r790255-hbase-1329-hbase-1551.jar
>
>
> I thought there was already a JIRA for this, but I cannot seem to find it.
> We need to manage multiple-node ZooKeeper quorums (required for the fully distributed option) in HBase to make things easier for users.
> Here's relevant IRC conversation with Ryan and Andrew:
> {code}
> Jun 17 18:14:39 <dj_ryan>	right now we include our client deps in hbase/lib
> Jun 17 18:14:47 <dj_ryan>	so removing zookeeper would be problematic
> Jun 17 18:14:56 <dj_ryan>	but hbase does put up a private zk quorum
> Jun 17 18:15:02 <dj_ryan>	it just doesnt bother with q>1
> Jun 17 18:15:05 <apurtell>	dj_ryan, nitay: agreed, so that's why i wonder about a private zk quorum managed by hbase
> Jun 17 18:15:12 <apurtell>	q ~= 5
> Jun 17 18:15:22 <dj_ryan>	so maybe we should ship tools to manage it
> Jun 17 18:15:23 <apurtell>	if possible
> Jun 17 18:15:29 <dj_ryan>	i can agree with that
> Jun 17 18:15:39 <nitay>	apurtell, ok, i'd be happy to bump the priority of hbase managing full cluster and work on that
> Jun 17 18:15:47 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has joined #hbase
> Jun 17 18:15:48 <apurtell>	nitay: that would be awesome
> Jun 17 18:15:57 <apurtell>	then i can skip discussions with cloudera about including zk also
> Jun 17 18:16:12 <apurtell>	and we can use some private ports that won't conflict with a typical zk install
> Jun 17 18:16:15 <nitay>	but i also think that users should be able to point at existing clusters, so as long as your rpms are compatible, it should be fine
> Jun 17 18:16:23 <nitay>	apurtell, isn't hadoop going to start using ZK
> Jun 17 18:16:31 <apurtell>	nitay: agree, but this is the cloudera-autoconfig-rpm (and deb) case
> Jun 17 18:16:34 <nitay>	the cloudera dude was working on using it for namenode whatnot like we do for master
> Jun 17 18:16:35 <dj_ryan>	so there are only 2 things
> Jun 17 18:16:38 <dj_ryan>	- set up myids
> Jun 17 18:16:38 <nitay>	what are they doing for that
> Jun 17 18:16:40 <dj_ryan>	- start zk
> Jun 17 18:16:42 <dj_ryan>	- stop zk
> Jun 17 18:16:50 <dj_ryan>	we dont want to start/stop zk just when we are doing a cluster bounce
> Jun 17 18:16:51 <nitay>	ye stupid myids
> Jun 17 18:16:52 <dj_ryan>	you start it once
> Jun 17 18:16:54 <dj_ryan>	and be done with it
> Jun 17 18:16:58 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has left #hbase ("Leaving.")
> Jun 17 18:17:13 <apurtell>	dj_ryan: yes, start it once. that's what i do. works fine through many hbase restarts...
> Jun 17 18:17:28 <nitay>	so then we need a separate shell cmd or something to stop zk
> Jun 17 18:17:35 <nitay>	and start on start-hbase if not already running type thing
> Jun 17 18:17:43 <dj_ryan>	yes
> Jun 17 18:17:58 <nitay>	ok
> Jun 17 18:18:19 <apurtell>	with quorum peers started on nodes in conf/regionservers, up to ~5 if possible
> Jun 17 18:18:37 <apurtell>	but what about zoo.cfg?
> Jun 17 18:18:51 <nitay>	oh i was thinking of having separate conf/zookeepers
> Jun 17 18:18:58 <apurtell>	nitay: even better
> Jun 17 18:18:59 <nitay>	but we can use first five RS too
> Jun 17 18:19:26 <nitay>	apurtell, yeah so really there wouldnt be a conf/zookeepers, i would rip out hostnames from zoo.cfg
> Jun 17 18:19:38 <nitay>	or go the other way, generate zoo.cfg from conf/zookeepers
> Jun 17 18:19:42 <nitay>	gotta do one or the other
> Jun 17 18:19:49 <nitay>	dont want to have to edit both
> Jun 17 18:19:54 <apurtell>	nitay: right
> Jun 17 18:20:21 <apurtell>	well...
> Jun 17 18:20:29 <nitay>	zoo.cfg has the right info right now, cause u need things other than just hostnames, i.e. client and quorum ports
> Jun 17 18:20:31 <apurtell>	we can leave out servers from our default zoo.cfg
> Jun 17 18:20:39 <apurtell>	and consider a conf/zookeepers
> Jun 17 18:20:47 <dj_ryan>	i call it conf/zoos
> Jun 17 18:20:54 <dj_ryan>	in my zookeeper config
> Jun 17 18:20:54 <dj_ryan>	dir
> Jun 17 18:20:57 <nitay>	and then have our parsing of zoo.cfg insert them
> Jun 17 18:21:08 <nitay>	cause right now its all off java Properties anyways
> Jun 17 18:21:12 <apurtell>	and let the zk wrapper parse the files if they exist and otherwise build the list of quorum peers like it does already
> Jun 17 18:21:34 <apurtell>	so someone could edit either and it would dtrt
> Jun 17 18:21:48 <nitay>	apurtell, yeah, makes sense
> Jun 17 18:21:58 <nitay>	we can discuss getting rid of zoo.cfg completely
> Jun 17 18:22:12 <nitay>	put it all in XML and just create a Properties for ZK off the right props
> Jun 17 18:22:14 <apurtell>	for my purposes, i just need some files available for a post install script to lay down a static hbase cluster config based on what it discovers about the hadoop installation
> Jun 17 18:23:56 <apurtell>	then i need to hook sysvinit and use chkconfig to enable/disable services on the cluster nodes according to their roles defined by hadoop/conf/masters and hadoop/conf/regionservers
> Jun 17 18:24:13 <apurtell>	so we put the hmaster on the namenode
> Jun 17 18:24:17 <apurtell>	and the region servers on the datanodes
> Jun 17 18:24:35 <apurtell>	hadoop/conf/slaves i mean
> Jun 17 18:24:44 <apurtell>	and pick N hosts out of slaves to host the zk quorum
> Jun 17 18:24:50 <apurtell>	make sense?
> Jun 17 18:25:33 <nitay>	yes i think so, and u'll be auto generating the hbase configs for what servers run what then?
> Jun 17 18:25:50 <apurtell>	nitay: yes
> Jun 17 18:25:51 <nitay>	which is why a simple line by line conf/zookeepers type file is clean and easy
> Jun 17 18:25:57 <apurtell>	nitay: agree
> Jun 17 18:25:59 <apurtell>	so i think my initial question has been answered, hbase will manage a private zk ensemble
> Jun 17 18:26:07 <apurtell>	... somehow
> Jun 17 18:26:10 <nitay>	right :)
> Jun 17 18:26:15 <apurtell>	ok, thanks
> {code}
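
The conversation above sketches two pieces of plumbing: a plain host-per-line file (conf/zookeepers, or conf/zoos) from which the server.N entries in zoo.cfg are generated, so only one file has to be edited, and a fallback to the first few hosts in conf/regionservers. A rough sketch of that generation step, with file names, the five-host cap, and ports as assumptions:

{code}
#!/bin/sh
# Hypothetical sketch: derive zoo.cfg server entries from a one-host-per-line
# conf/zookeepers file, falling back to the first five hosts in
# conf/regionservers when conf/zookeepers is absent.
ZOOCFG=conf/zoo.cfg
if [ -f conf/zookeepers ]; then
  HOSTS=$(cat conf/zookeepers)
else
  HOSTS=$(head -5 conf/regionservers)
fi

ID=1
for host in ${HOSTS}; do
  # Append one quorum-peer entry per host; ports are the conventional defaults.
  echo "server.${ID}=${host}:2888:3888" >> ${ZOOCFG}
  ID=$((ID + 1))
done
{code}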

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.