Posted to dev@hbase.apache.org by "Nitay Joffe (JIRA)" <ji...@apache.org> on 2009/06/19 22:48:07 UTC

[jira] Created: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

HBase should manage multiple node ZooKeeper quorum
--------------------------------------------------

                 Key: HBASE-1551
                 URL: https://issues.apache.org/jira/browse/HBASE-1551
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: Nitay Joffe
            Assignee: Nitay Joffe
             Fix For: 0.20.0


I thought there was already a JIRA for this, but I cannot seem to find it.

We need HBase to manage multi-node ZooKeeper quorums (required for the fully distributed option) to make things easier for users.

Here's the relevant IRC conversation with Ryan and Andrew:

{code}
Jun 17 18:14:39 <dj_ryan>	right now we include our client deps in hbase/lib
Jun 17 18:14:47 <dj_ryan>	so removing zookeeper would be problematic
Jun 17 18:14:56 <dj_ryan>	but hbase does put up a private zk quorum
Jun 17 18:15:02 <dj_ryan>	it just doesnt bother with q>1
Jun 17 18:15:05 <apurtell>	dj_ryan, nitay: agreed, so that's why i wonder about a private zk quorum managed by hbase
Jun 17 18:15:12 <apurtell>	q ~= 5
Jun 17 18:15:22 <dj_ryan>	so maybe we should ship tools to manage it
Jun 17 18:15:23 <apurtell>	if possible
Jun 17 18:15:29 <dj_ryan>	i can agree with that
Jun 17 18:15:39 <nitay>	apurtell, ok, i'd be happy to bump the priority of hbase managing full cluster and work on that
Jun 17 18:15:47 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has joined #hbase
Jun 17 18:15:48 <apurtell>	nitay: that would be awesome
Jun 17 18:15:57 <apurtell>	then i can skip discussions with cloudera about including zk also
Jun 17 18:16:12 <apurtell>	and we can use some private ports that won't conflict with a typical zk install
Jun 17 18:16:15 <nitay>	but i also think that users should be able to point at existing clusters, so as long as your rpms are compatible, it should be fine
Jun 17 18:16:23 <nitay>	apurtell, isn't hadoop going to start using ZK
Jun 17 18:16:31 <apurtell>	nitay: agree, but this is the cloudera-autoconfig-rpm (and deb) case
Jun 17 18:16:34 <nitay>	the cloudera dude was working on using it for namenode whatnot like we do for master
Jun 17 18:16:35 <dj_ryan>	so there are only 2 things
Jun 17 18:16:38 <dj_ryan>	- set up myids
Jun 17 18:16:38 <nitay>	what are they doing for that
Jun 17 18:16:40 <dj_ryan>	- start zk
Jun 17 18:16:42 <dj_ryan>	- stop zk
Jun 17 18:16:50 <dj_ryan>	we dont want to start/stop zk just when we are doing a cluster bounce
Jun 17 18:16:51 <nitay>	ye stupid myids
Jun 17 18:16:52 <dj_ryan>	you start it once
Jun 17 18:16:54 <dj_ryan>	and be done with ti
Jun 17 18:16:58 *	iand (n=iand@205.158.58.226.ptr.us.xo.net) has left #hbase ("Leaving.")
Jun 17 18:17:13 <apurtell>	dj_ryan: yes, start it once. that's what i do. works fine through many hbase restarts...
Jun 17 18:17:28 <nitay>	so then we need a separate shell cmd or something to stop zk
Jun 17 18:17:35 <nitay>	and start on start-hbase if not already running type thing
Jun 17 18:17:43 <dj_ryan>	yes
Jun 17 18:17:58 <nitay>	ok
Jun 17 18:18:19 <apurtell>	with quorum peers started on nodes in conf/regionservers, up to ~5 if possible
Jun 17 18:18:37 <apurtell>	but what about zoo.cfg?
Jun 17 18:18:51 <nitay>	oh i was thinking of having separate conf/zookeepers
Jun 17 18:18:58 <apurtell>	nitay: even better
Jun 17 18:18:59 <nitay>	but we can use first five RS too
Jun 17 18:19:26 <nitay>	apurtell, yeah so really there wouldnt be a conf/zookeepers, i would rip out hostnames from zoo.cfg
Jun 17 18:19:38 <nitay>	or go the other way, generate zoo.cfg from conf/zookeepers
Jun 17 18:19:42 <nitay>	gotta do one or the other
Jun 17 18:19:49 <nitay>	dont want to have to edit both
Jun 17 18:19:54 <apurtell>	nitay: right
Jun 17 18:20:21 <apurtell>	well...
Jun 17 18:20:29 <nitay>	zoo.cfg has the right info right now, cause u need things other than just hostnames, i.e. client and quorum ports
Jun 17 18:20:31 <apurtell>	we can leave out servers from our default zoo.cfg
Jun 17 18:20:39 <apurtell>	and consider a conf/zookeepers
Jun 17 18:20:47 <dj_ryan>	i call it conf/zoos
Jun 17 18:20:54 <dj_ryan>	in my zookeeper config
Jun 17 18:20:54 <dj_ryan>	dir
Jun 17 18:20:57 <nitay>	and then have our parsing of zoo.cfg insert them
Jun 17 18:21:08 <nitay>	cause right now its all off java Properties anyways
Jun 17 18:21:12 <apurtell>	and let the zk wrapper parse the files if they exist and otherwise build the list of quorum peers like it does already
Jun 17 18:21:34 <apurtell>	so someone could edit either and it would dtrt
Jun 17 18:21:48 <nitay>	apurtell, yeah, makes sense
Jun 17 18:21:58 <nitay>	we can discuss getting rid of zoo.cfg completely
Jun 17 18:22:12 <nitay>	put it all in XML and just create a Properties for ZK off the right props
Jun 17 18:22:14 <apurtell>	for my purposes, i just need some files available for a post install script to lay down a static hbase cluster config based on what it discovers about the hadoop installation
Jun 17 18:23:56 <apurtell>	then i need to hook sysvinit and use chkconfig to enable/disable services on the cluster nodes according to their roles defined by hadoop/conf/masters and hadoop/conf/regionservers
Jun 17 18:24:13 <apurtell>	so we put the hmaster on the namenode
Jun 17 18:24:17 <apurtell>	and the region servers on the datanodes
Jun 17 18:24:35 <apurtell>	hadoop/conf/slaves i mean
Jun 17 18:24:44 <apurtell>	and pick N hosts out of slaves to host the zk quorum
Jun 17 18:24:50 <apurtell>	make sense?
Jun 17 18:25:33 <nitay>	yes i think so, and u'll be auto generating the hbase configs for what servers run what then?
Jun 17 18:25:50 <apurtell>	nitay: yes
Jun 17 18:25:51 <nitay>	which is why a simple line by line conf/zookeepers type file is clean and easy
Jun 17 18:25:57 <apurtell>	nitay: agree
Jun 17 18:25:59 <apurtell>	so i think my initial question has been answered, hbase will manage a private zk ensemble
Jun 17 18:26:07 <apurtell>	... somehow
Jun 17 18:26:10 <nitay>	right :)
Jun 17 18:26:15 <apurtell>	ok, thanks
{code}
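
To make the shape of this concrete, here is a hypothetical sketch of the conf/zookeepers file discussed above and the zoo.cfg server entries (plus per-host myid files) it would expand into. The hostnames are illustrative only; the ports are ZooKeeper's conventional quorum/leader-election ports:

{code}
# conf/zookeepers -- one quorum peer hostname per line (illustrative names)
zk1.example.com
zk2.example.com
zk3.example.com

# server.N entries the wrapper would merge into the zoo.cfg properties:
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888

# Each peer also needs dataDir/myid holding just its own number,
# e.g. the myid file on zk2.example.com contains the single line: 2
{code}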



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726658#action_12726658 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

As long as both can talk to it, yes it can. All you have to change is the property zookeeper.znode.parent, which is the base directory where HBase stores its data in ZooKeeper. It defaults to "/hbase". You can set one cluster to, e.g., "/hbase1" and the other to "/hbase2", and they will not interfere with each other.
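
As a sketch, each cluster's hbase-site.xml would carry its own parent znode, e.g. for the first cluster:

{noformat}
<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase1</value>
  <description>Parent znode for this cluster's data in ZooKeeper;
    the second cluster would set /hbase2 here instead.</description>
</property>
{noformat}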




[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726656#action_12726656 ] 

Chris K Wensel commented on HBASE-1551:
---------------------------------------

This raises the question: can a ZK cluster be shared across multiple HBase clusters?




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727132#action_12727132 ] 

stack commented on HBASE-1551:
------------------------------

Nitay, do we really need bin/hbase-zookeepers? Can't we make bin/hbase-daemons.sh work by switching on what it's been asked to start? There is little difference between the scripts (especially when you don't change the usage string -- smile).

Fix the usage string in zookeepers.sh (it says zookeeper instead of zookeepers).

In HQP, rename the 'run' method to something else (because 'run' is usually associated with Thread). Same in ZKST.

The patch includes a binary for the new ZK jar. Maybe leave it out in the future.

Otherwise, the patch looks great. It lacks documentation, but that'll be in a different issue?

What should I test?



[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Attachment:     (was: zookeeper-r790255-hbase-1329-hbase-1551.jar)




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726267#action_12726267 ] 

Andrew Purtell commented on HBASE-1551:
---------------------------------------

Based on my understanding, the Cloudera ZK install will put a working zoo.cfg in /etc/zookeeper/conf/zoo.cfg. For the first pass, this will bring up a single-node ensemble only. HBase will need to append the ensemble details to zoo.cfg; this can be a post-install step executed by the admin. We will therefore also need to create the myid files. They have not yet settled on where the data files will be located.

On a Cloudera distro, the ZK service will be a set of external daemons from the HBase perspective, managed separately.
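
As a sketch of that post-install append (hostnames illustrative, and the data directory location is still unsettled as noted above):

{noformat}
# Appended to /etc/zookeeper/conf/zoo.cfg to turn the single-node
# install into an ensemble, using the conventional quorum/election ports:
server.1=node1.example.com:2888:3888
server.2=node2.example.com:2888:3888
server.3=node3.example.com:2888:3888

# Each node also needs a myid file in the (yet to be settled) data
# directory containing that node's server number, e.g. "2" on node2.
{noformat}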




[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Status: Patch Available  (was: In Progress)




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726525#action_12726525 ] 

Chris K Wensel commented on HBASE-1551:
---------------------------------------


bq. @stack/cwensel, hbase management of ZK is in a bash/env thing. There's a setting in hbase-env.sh called HBASE_MANAGES_ZK which toggles whether HBase will start/stop ZK.

Does this mean we are storing the value in two places?

{noformat}
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
{noformat}
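
For reference, the hbase-env.sh side of it looks something like this (a sketch; HBASE_MANAGES_ZK is the setting named in the quote above):

{noformat}
# hbase-env.sh
# Tell HBase whether it should manage its own ZooKeeper ensemble.
export HBASE_MANAGES_ZK=true
{noformat}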





[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729385#action_12729385 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

Committed. Changed HADOOP stuff to HBASE.

The list of ZooKeeper servers comes from zoo.cfg if there is one on the classpath, otherwise from hbase-site.xml.

If you want a list of ZooKeeper hosts you can run bin/hbase org.apache.hadoop.hbase.zookeeper.ZKServerTool.
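For example, invoking the tool from the shell might look like this (the host names and exact output format are illustrative, not taken from the tool):

{noformat}
$ bin/hbase org.apache.hadoop.hbase.zookeeper.ZKServerTool
host1.example.com
host2.example.com
host3.example.com
{noformat}

bin/zookeepers.sh can then iterate over this list to start or stop a quorum peer on each host.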




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726574#action_12726574 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

No. They are separate things. hbase.cluster.distributed is for switching between local/pseudo-distributed mode and fully distributed mode.
It replaces what setting hbase.master to "local" used to do.
HBASE_MANAGES_ZK is for whether HBase should start/stop ZK.
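As a concrete illustration of the two independent knobs (the values shown are examples only):

{noformat}
# conf/hbase-env.sh: should HBase itself start/stop the ZooKeeper ensemble?
export HBASE_MANAGES_ZK=true
{noformat}

{noformat}
<!-- conf/hbase-site.xml: standalone/pseudo-distributed vs. fully distributed -->
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
{noformat}

Flipping one does not imply the other: a fully distributed cluster can run with either a managed or an unmanaged ZooKeeper.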




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729383#action_12729383 ] 

stack commented on HBASE-1551:
------------------------------

Go ahead and commit, Nitay.  We can file issues with it as we find them.  I don't have resources to hand to test this at the moment.




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726270#action_12726270 ] 

Chris K Wensel commented on HBASE-1551:
---------------------------------------

there are three issues here:

 - hbase as client to zk
 - deployment of zk cluster
 - zk server configuration

it is a hard requirement that hbase knows who its zk quorum is, as a consumer of zk.

it is optional that the hbase scripts manage a zk cluster (though a cluster is required somewhere).

and if hbase is managing the zk cluster, zk still has to find its zoo.cfg.

it would be "nice" (if a bit awkward) if the hbase zk client looked for zoo.cfg when there isn't already a list of quorum servers in hbase-site.xml.

if hbase took responsibility for deploying/managing zk servers across a cluster, then a 'conf/zookeepers' file should be maintained. it should be noted that zk is very ec2 unfriendly, since all parties have to be known at provisioning time. that leads us to want a standalone zk cluster shared across tandem clusters.

i opened HBASE-1600 so we can have multiple hbase client instances talking to distinct hbase clusters from within a single jvm.

also thinking that turning zk management by hbase on/off should be a bash/env config, not an hbase-site.xml config like hbase.cluster.distributed.
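For illustration, a minimal sketch of the client-side lookup order described above: take the quorum from hbase-site.xml if set, otherwise fall back to a zoo.cfg found on the classpath. The property name zookeeper.quorum follows the proposal in this thread; the rest is plain-Java scaffolding, not HBase's actual implementation:

{code}
import java.io.InputStream;
import java.util.Properties;

public class QuorumLookup {
  // Sketch only: resolve the quorum an hbase zk client should use.
  public static String getQuorum(Properties hbaseSite) throws Exception {
    String quorum = hbaseSite.getProperty("zookeeper.quorum");
    if (quorum != null) {
      return quorum;                              // hbase-site.xml wins
    }
    InputStream in = QuorumLookup.class.getResourceAsStream("/zoo.cfg");
    if (in == null) {
      return "localhost";                         // neither source: default
    }
    Properties zoo = new Properties();
    zoo.load(in);                                 // zoo.cfg fallback
    StringBuilder hosts = new StringBuilder();
    for (String key : zoo.stringPropertyNames()) {
      if (key.startsWith("server.")) {            // server.N=host:peer:election
        if (hosts.length() > 0) hosts.append(',');
        hosts.append(zoo.getProperty(key).split(":")[0]);
      }
    }
    return hosts.toString();
  }
}
{code}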





[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722005#action_12722005 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

This is a bit trickier than a simple set of bash scripts because we need to parse zoo.cfg (and put in values from hbase-site.xml).
I'm writing a java tool to do this that the shell scripts can call into. Let me know if you guys have other ideas.
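For a sense of what such a tool could look like: a rough sketch that loads zoo.cfg as java.util.Properties and prints one quorum host per line for the shell scripts to consume. The class name and the (omitted) hbase-site.xml overlay step are assumptions, not the patch's actual code:

{code}
import java.io.FileInputStream;
import java.util.Properties;

public class ZooCfgHostList {
  // Sketch only: print quorum hosts line by line for bin scripts to loop over.
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.load(new FileInputStream(args[0]));     // e.g. conf/zoo.cfg
    // Values from hbase-site.xml would be overlaid on props here.
    for (String key : props.stringPropertyNames()) {
      if (key.startsWith("server.")) {            // server.N=host:peer:election
        System.out.println(props.getProperty(key).split(":")[0]);
      }
    }
  }
}
{code}

A wrapper script can then loop over the output, e.g. for host in $(java ZooCfgHostList conf/zoo.cfg); do ssh $host ...; done.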




[jira] Work started: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-1551 started by Nitay Joffe.




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726349#action_12726349 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

@andrew, see ZOOKEEPER-107 and possibly ZOOKEEPER-29.

@stack/cwensel, hbase management of ZK is in a bash/env thing. There's a setting in hbase-env.sh called HBASE_MANAGES_ZK which toggles whether HBase will start/stop ZK.

@stack/andrew/cwensel, Yes, the iterator thing is along the lines of what I was thinking of.
Here's my current thinking:
- Move all of the ZooKeeper config parameters into hbase-*.xml using zookeeper.property.KEY = VALUE.
- Add a special property for the list of quorum servers, say zookeeper.quorum. This option can default to "localhost".
- If there is a zoo.cfg present in the classpath, let its data take precedence over the zookeeper.property.KEY options.
- When we need to instantiate something to talk to ZooKeeper, we simply create a new HBaseConfiguration and call some method on it, e.g. toZooKeeperProperties() (sketched below).
This method will iterate through the zookeeper.property.KEY entries and turn each into the appropriate ZooKeeper configuration (i.e. KEY=VALUE). It will generate
the server.X properties from the zookeeper.quorum configuration option. As mentioned above, if there is a zoo.cfg in the classpath, its configuration overwrites this data.
This will return a Properties object that can be used to construct the appropriate ZooKeeper config and to start/talk to the servers.
- For start/stop management of the full ZK quorum cluster, use something like my ZKServerTool in the patch (modified of course) to do the parsing mentioned above and turn it
into a simple line-by-line list of quorum servers. As I do in this patch, bin/zookeepers.sh can then simply call bin/hbase o.a.h.h.z.ZKServerTool to get the list of hosts.
If you want something like a conf/zookeepers you can simply run ZKServerTool yourself.

The benefits from all this are:
- One place for all ZK configuration. No duplicate setting of parameters.
- No more nasty zoo.cfg. Give users what they're already used to: a single XML config file.
- New users only need to edit zookeeper.quorum to get a full cluster.
- Programmable control of which ZK one is talking to.

That's all I can think of for now.

Thoughts?
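For concreteness, a minimal sketch of the proposed toZooKeeperProperties() translation, assuming the zookeeper.property.KEY convention and a comma-separated zookeeper.quorum value; the peer/election ports (2888/3888 are ZooKeeper's conventional defaults) and everything else here are illustrative, not the committed code:

{code}
import java.util.Map;
import java.util.Properties;

public class ZKPropertiesSketch {
  static final String PREFIX = "zookeeper.property.";

  // Sketch only: turn hbase-*.xml entries into a ZooKeeper Properties object.
  // The zoo.cfg-overrides-everything step described above is omitted.
  public static Properties toZooKeeperProperties(Map<String, String> conf) {
    Properties zkProps = new Properties();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      if (e.getKey().startsWith(PREFIX)) {
        // zookeeper.property.KEY=VALUE becomes KEY=VALUE
        zkProps.setProperty(e.getKey().substring(PREFIX.length()), e.getValue());
      }
    }
    // zookeeper.quorum=host1,host2,... becomes server.0, server.1, ...
    String quorum = conf.getOrDefault("zookeeper.quorum", "localhost");
    int i = 0;
    for (String host : quorum.split(",")) {
      zkProps.setProperty("server." + (i++), host.trim() + ":2888:3888");
    }
    return zkProps;
  }
}
{code}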




[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726294#action_12726294 ] 

Andrew Purtell commented on HBASE-1551:
---------------------------------------

Anybody know offhand of ZK jira(s) for dynamic quorum peer discovery and addition/removal? 



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722020#action_12722020 ] 

Andrew Purtell commented on HBASE-1551:
---------------------------------------

@Nitay: We can still do this privately for the convenience of users who want HBase to manage ZK on their behalf, but for any Cloudera packaging it has been decided that we will unbundle ZK and just pick up the configuration produced by their ZK package. Otherwise I think this issue could be resolved as Later. 
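
For what it's worth, picking up an externally produced zoo.cfg could stay very simple, since the file is plain key=value lines. A rough sketch (the path and class name here are assumptions, not part of any package):

{code}
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class ExternalZooCfg {
  public static void main(String[] args) throws IOException {
    // The path is an assumption; a ZK package would define its own location.
    FileInputStream in = new FileInputStream("/etc/zookeeper/zoo.cfg");
    Properties zkProps = new Properties();
    // zoo.cfg is plain key=value lines, so Properties.load() can read it.
    zkProps.load(in);
    in.close();
    System.out.println("clientPort = " + zkProps.getProperty("clientPort"));
  }
}
{code}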



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729064#action_12729064 ] 

stack commented on HBASE-1551:
------------------------------

@nitay

The comments up top talk about HADOOP env variables, but down in the code it uses HBASE ones (in zookeepers.sh).

So no file with a list of hosts to start zookeeper on?  Just read from hbase-site.xml?

Looks good Nitay.  Didn't test.  Will try it in the morning.



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722024#action_12722024 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

Right, thanks for the info. I'll keep playing with this. If nothing else, it will get rid of my own scripts for doing such things :).



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726619#action_12726619 ] 

Nitay Joffe commented on HBASE-1551:
------------------------------------

@stack, okay, all the points you make are good and easily implementable. I'll open a separate JIRA for this and link it to this one.



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726311#action_12726311 ] 

stack commented on HBASE-1551:
------------------------------

@nitay you have JRuby available, so if you wanted to write the parsing in Ruby, it's available to you.

@nitay, in HBaseConfiguration, there is http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/conf/Configuration.html#iterator().  You could iterate over all properties in the Configuration and use any that have an hbase.zookeeper prefix when writing configuration for zk.  This would give you a non-hardcoded way of adding zk config to hbase-site.xml.
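
For illustration, a minimal sketch of that idea, assuming an hbase.zookeeper. prefix (the class and method names here are hypothetical; only Configuration's iterator() is from the linked API):

{code}
import java.util.Map;
import java.util.Properties;
import org.apache.hadoop.conf.Configuration;

public class ZKPropsSketch {
  // Hypothetical prefix; the suggestion above only says "hbase.zookeeper".
  private static final String ZK_PREFIX = "hbase.zookeeper.";

  // Iterate every property in the Configuration and copy the ones carrying
  // the prefix into a java.util.Properties, stripping the prefix so the
  // keys look like plain zoo.cfg keys.
  public static Properties extractZKProperties(Configuration conf) {
    Properties zkProps = new Properties();
    for (Map.Entry<String, String> entry : conf) {
      String key = entry.getKey();
      if (key.startsWith(ZK_PREFIX)) {
        zkProps.setProperty(key.substring(ZK_PREFIX.length()), entry.getValue());
      }
    }
    return zkProps;
  }
}
{code}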

@nitay, if we did something like the above, would it be possible for a client to do HBaseConfiguration.set("hbase.zookeeper.... quorum", blah, blah) and THEN connect to the zk quorum, or would that be too late?  When does the client try to connect to the quorum?  On class loading, or when you create an HTable instance?
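
For concreteness, a minimal sketch of the client-side pattern in question, assuming hbase.zookeeper.quorum as the eventual property name and the 0.20-era HTable constructor; whether the set() happens early enough is exactly the open question:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ClientQuorumSketch {
  public static void main(String[] args) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();
    // Point the client at an explicit quorum before any table is opened.
    conf.set("hbase.zookeeper.quorum", "host1,host2,host3");
    // If the client only connects to zk when the HTable is created (and not
    // at class-loading time), this ordering would work.
    HTable table = new HTable(conf, "mytable");
    // ... use the table ...
  }
}
{code}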

bq. also thinking that turning on/off zk management by hbase should be a bash/env config, not an hbase-site.xml config, hbase.cluster.distributed

This seems like a good idea.



[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Attachment: hbase-1551-v2.patch

Second version of the patch.

- Removed hbase-zookeepers.sh as Stack suggested and replaced it with a switch in hbase-daemons.sh.
- Fixed method names.

I'll open a JIRA for documenting ZooKeeper management in HBase.

I tested starting/stopping a full ZK cluster and using an existing ZK quorum. I tested it with configuration in hbase-site.xml and in zoo.cfg. I can't think of anything else to test at the moment.



[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727130#action_12727130 ] 

stack commented on HBASE-1551:
------------------------------

bq. ...but i'm wondering if hbase's responsibility for zk stops there since it is not embedded into hbase, but is external.

Yeah, we wondered the same and tried to do it that way initially.  But then fellas wanted to programmatically set the zk cluster location, and zoo.cfg was becoming a pain.  This patch probably errs on providing too much zk management, but I think it fair to say we are here because nitay's and j-d's experiments up to this point have not been satisfactory (is that fair Nitay/J-D?)



[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Attachment:     (was: zookeeper-edits.patch)

[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726617#action_12726617 ] 

stack commented on HBASE-1551:
------------------------------

@nitay on hbase-env.sh, yeah, I was thinking the value in cwensel's point was that we had the distributed setting spanning two configs.  You cleared up my misconception.

@nitay, I think you need to keep the hbase zookeeper config inside an hbase namespace.  The Hadoop Configuration system is a floozy: it will go with anyone who calls load resource on it, pulling in their properties.  I could see that out in an MR task, the Configuration could have all kinds of pollution in it.  Would suggest an hbase prefix -- an hbase.zookeeper prefix?

@nitay on "if zoo.cfg in the CLASSPATH", that might work.  We might want to try narrow the places we look on the CLASSPATH.  But lets start open and narrow later (we probably want this to be open as possible at mo. until we learn more about the cloudera config.)

@nitay on toZooKeeperProperties, do we have to expose that?  Can't we just pass an HBaseConfiguration to the HBase ZK wrapper?  (I'm not up on the latest dev here, so this might be an off suggestion.)
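
A minimal sketch of the prefix extraction stack is suggesting might look like this, assuming a hypothetical hbase.zookeeper.property. namespace and a makeZKProps helper (both names are illustrative, not part of the patch):

{code}
import java.util.Map;
import java.util.Properties;
import org.apache.hadoop.conf.Configuration;

public class ZKProps {
  // Hypothetical namespace; everything under it is handed to ZooKeeper.
  private static final String ZK_PREFIX = "hbase.zookeeper.property.";

  /**
   * Pull only the hbase.zookeeper.property.* keys out of the (possibly
   * polluted) Configuration and return them as the java.util.Properties
   * that ZooKeeper's own config parsing expects.
   */
  public static Properties makeZKProps(Configuration conf) {
    Properties zkProps = new Properties();
    for (Map.Entry<String, String> entry : conf) {
      String key = entry.getKey();
      if (key.startsWith(ZK_PREFIX)) {
        // e.g. hbase.zookeeper.property.clientPort becomes clientPort
        zkProps.setProperty(key.substring(ZK_PREFIX.length()),
            entry.getValue());
      }
    }
    return zkProps;
  }
}
{code}

The point of the prefix is that a polluted Configuration can carry any number of foreign keys without them leaking into the Properties handed to ZooKeeper.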



[jira] Updated: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1551:
-------------------------------

    Attachment: zookeeper-r790255-hbase-1329-hbase-1551.jar
                zookeeper-edits.patch
                hbase-1551.patch

Here is a first stab at this. It could probably use some cleanup, but it should show how things will end up working.

The idea is to have bin/hbase-zookeepers.sh (which is like hbase-daemons.sh, but for ZooKeeper) call out to ZKServerTool, which reads conf/zookeepers and conf/zoo.cfg to get the list of ZooKeeper servers in the quorum. bin/hbase-zookeepers.sh then starts an HQuorumPeer on each of those servers. Each HQuorumPeer figures out which server in the list it is and writes its myid file.
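
As a rough sketch of that last step (the method name and the exact hostname-matching rule here are illustrative, not necessarily what the patch does):

{code}
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.net.InetAddress;
import java.util.List;

public class MyIdWriter {
  /**
   * Compare the local hostname against the ordered quorum server list and
   * write the matching 1-based index into dataDir/myid, which is where a
   * ZooKeeper quorum peer expects to find its id.
   */
  public static void writeMyID(List<String> quorumServers, File dataDir)
      throws IOException {
    String myName = InetAddress.getLocalHost().getCanonicalHostName();
    for (int i = 0; i < quorumServers.size(); i++) {
      if (quorumServers.get(i).equals(myName)) {
        FileWriter out = new FileWriter(new File(dataDir, "myid"));
        try {
          out.write(Integer.toString(i + 1)); // server ids start at 1
        } finally {
          out.close();
        }
        return;
      }
    }
    throw new IOException(myName + " not found in quorum server list");
  }
}
{code}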

Note that I had to make some minor changes to ZooKeeper. I've attached the edits to their code along with a Jar that contains the edited version.

There is a case missing that I still need to work out: the HQuorumPeer currently finds its myid by comparing its hostname to the servers in zoo.cfg, but it needs to read conf/zookeepers too.

As you can probably tell, this is all a big mess. While working on it, I was thinking: what if we just got rid of zoo.cfg altogether? ZooKeeper out of the box can't read our zoo.cfg anyway, because we are injecting things from hbase-site.xml. I think we can put all of the ZK config options in hbase-site.xml and put the list of quorum servers in conf/zookeepers. Anyone who wants to find the quorum would then assemble the host:port list from conf/zookeepers and hbase-site.xml.
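
Assembling that host:port list might then look like the following sketch, assuming conf/zookeepers holds one hostname per line (with # comments allowed) and the shared client port comes from hbase-site.xml:

{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class QuorumString {
  /**
   * Read one hostname per line from conf/zookeepers (skipping blanks and
   * # comments) and join them with the shared client port into the
   * "host1:port,host2:port,..." string a ZooKeeper client connects with.
   */
  public static String assemble(String zookeepersFile, int clientPort)
      throws IOException {
    List<String> hosts = new ArrayList<String>();
    BufferedReader in = new BufferedReader(new FileReader(zookeepersFile));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        line = line.trim();
        if (line.length() > 0 && !line.startsWith("#")) {
          hosts.add(line);
        }
      }
    } finally {
      in.close();
    }
    StringBuilder sb = new StringBuilder();
    for (String host : hosts) {
      if (sb.length() > 0) sb.append(',');
      sb.append(host).append(':').append(clientPort);
    }
    return sb.toString();
  }
}
{code}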

I chatted about this on IRC with Chris Wensel. He makes a good point that shipping conf/zookeepers doesn't seem right and that we should write the ZK quorum host:port property to hbase-site.xml. I agree with him, but I want to avoid putting the information in two places.

Andrew, I'm interested in your thoughts on the matter as I know you need this JIRA for your stuff.
 
Here's the relevant conversation:

{code}
[15:28]  <St^Ack> nitay: what about max connections and stuff like that?
[15:29]  <nitay> St^Ack, i'd move it all to hbase-site.xml
[15:29]  <nitay> generate a java "Properties" from the right options
[15:29]  <nitay> and feed that to ZK
[15:29]  <nitay> right now its zoo.cfg => Properties => ZK Config => ZK
[15:29]  <cwensel> nitay: i think you only need quorum servers.. the rest isn't used by the client (could be wrong)
[15:29]  <nitay> instead it'd be HBaseConfiguration => ZK Properties => ZK Config...
[15:30]  <nitay> cwensel, clientPort and some of the tick stuff i think may be used
[15:31]  <nitay> i have the tool to read conf/zookeepers or conf/zoo.cfg for server list
[15:31]  <nitay> but its just ugly
[15:31]  <nitay> having to do that
[15:31]  <nitay> and then in hbase we're gonna have to inject the server list from conf/zookeepers into conf/zoo.cfg
[15:31]  <nitay> seems ugly to me
[15:31]  <cwensel> nitay: true, but you are parsing the servers and adding the ports.. should just use a client string
[15:31]  <cwensel> you can keep zoo.cfg, just don't use it for the client side
[15:31]  <cwensel> make it optional if hbase is managing the zk instance
[15:32]  <cwensel> allow it to be removed if there is a tandem zk cluster
[15:32]  <nitay> ye that works, it'd be much cleaner if client just write conf/zookeepers and we put it together with ports
[15:32]  <nitay> that part is trivial
[15:33]  <nitay> or directly puts in host:port string as u're saying
[15:33]  <cwensel> just force the user to put the string of servers in hbase-site.xml
[15:33]  <cwensel> but keep zoo.cfg to reduce friction for local test hbase servers
[15:33]  <nitay> well then its in two places, there and conf/zookeepers
[15:33]  <nitay> we want conf/zookeepers no matter what
[15:33]  <cwensel> why?
[15:33]  <nitay> b/c its consistent with conf/regionservers etc
[15:34]  <nitay> for users that have hbase managing zk
[15:34]  <cwensel> but those are only used by the base scripts, right?
[15:34]  <nitay> yes
[15:34]  <cwensel> not read by the config files
[15:34]  <nitay> well yes and no
[15:34]  <cwensel> i would create a 'client' property that points to the zk servers.. 
[15:34]  <nitay> im proposing they would be, basically im just saying i dont want this information in more than one place
[15:35]  <cwensel> i understand
[15:35]  <nitay> right now its going to be in zoo.cfg and conf/zookeepers
[15:35]  <nitay> u're saying it'll be in hbase-site.xml and conf/zookeepers
[15:35]  <cwensel> but hbase config shouldn't read zookeepers
[15:35]  <nitay> why not?
[15:35]  <cwensel> i wouldn't have zookeepers.. hbase managing a zk cluster is a convenience.. unlikely to happen in the wild in production
[15:35]  <nitay> hmm i dont necessarily agree
[15:36]  <nitay> i think there's lots of users that will have no idea what zk is and just want it to work
[15:36]  <cwensel> the only thing hadoop reads are .xml (default and site) files
[15:36]  <cwensel> don't couple bash script convenience to the hbase config object
[15:36]  <cwensel> agreed on it just working..
[15:37]  <cwensel> but it will be a local pseudo cluster with only one zk instance
[15:37]  <nitay> i see your point, but is there a way we can get best of both worlds, no coupling, yet information in one place
[15:37]  <nitay> sure in local pseudo mode conf/zookeepers will be just "localhost"
[15:37]  <cwensel> default will be localhost:2181
[15:37]  <cwensel> in both xml and cfg
[15:37]  <cwensel> should work out of the box that way
[15:38]  <cwensel> unless you want to write scripts that launch tandem zk instances across the cluster (thinking that's out of scope) i would make all the zk stuff optional
[15:38]  <nitay> yes again the issue is when user wants to go fully dist they have to manually write conf/zookeepers (simple) and for no apparent reason also to hbase-site.xml
{code}

[jira] Commented: (HBASE-1551) HBase should manage multiple node ZooKeeper quorum

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726654#action_12726654 ] 

Chris K Wensel commented on HBASE-1551:
---------------------------------------

I keep coming back to whether or not hbase should 'manage' a zk cluster as anything more than a getting-started convenience.

as a _convenience_, the scripts should boot a tandem zk instance so hbase will run.

and by default, the hbase client should look at localhost for the zk quorum server (singular).

and the env should state whether a simple zk instance should be started on 'start' and stopped on 'stop', true by default.

but i'm wondering if hbase's responsibility for zk stops there since it is not embedded into hbase, but is external.

by stopping there, zk server params don't leak into the hbase config. we still have zoo.cfg, but only the zk server side cares about it.

that is, by changing an env prop (manage zk or not), i invalidate the need for a whole set of hbase-default.xml properties for that deployment. i'm thinking hbase props should be valid regardless of the topology/env; otherwise they aren't hbase properties but env properties (for the most part).

to illustrate a bit..

initially in ec2, i'll probably have only one master, and so only one zk instance. under ec2 we provision/boot the master first to get its name (it will still be tricky to bootstrap zk here). slaves are then booted with the master name embedded in the *-site.xml files. this pattern works great for developers that need an ephemeral cluster, and a 'good enough' production-like cluster (to support batch jobs).

long term, we will need to provision N hbase masters against a bespoke standalone/independent zk cluster. the zk cluster will need to be provisioned first, then the zk nodes configured so they know the quorum.

in these scenarios, we never touch start-hbase/stop-hbase or any high-level bash scripts, just the core scripts that start individual services.


