Posted to user@zookeeper.apache.org by fengguang gong <go...@icloud.com> on 2013/09/11 08:17:15 UTC

How does ZooKeeper handle fault detection in a distributed storage system

Hi all,

Recently my lab has wanted to use ZooKeeper (zk) to manage our cluster, specifically for fault detection. Our cluster
includes three kinds of nodes:
1. Dispatch node: balances load and dispatches data.
2. Store node: receives data from the dispatch nodes and stores it.
3. Middleware: queries data from all the store nodes.
My question is: how does zk handle fault detection in this system, that is, how do the dispatch nodes
and the middleware learn that a store node is down?

RE: How does ZooKeeper handle fault detection in a distributed storage system

Posted by Rakesh R <ra...@huawei.com>.

Hi Fengguang,

> My question is: how does zk handle fault detection in this
> system, that is, how do the dispatch nodes and the middleware learn
> that a store node is down?

Adding a few more points; I hope they help :)

ZooKeeper has a hierarchical structure, much like a distributed file system. You can create znodes in it the way you create files under a directory.

For your use case, each store node creates a ZooKeeper client session and registers itself with ZooKeeper by creating an ephemeral znode that signals its presence.

The dispatch and middleware nodes can watch these znodes using ZooKeeper's watch notification mechanism. When a store node goes down, it loses its ZooKeeper connection, and once its session expires ZooKeeper deletes the ephemeral znode. The middleware and dispatch nodes then receive watch notifications, and you can put your own logic in those watchers.
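As a rough illustration of the "add your logic on these watchers" step, here is a minimal Java sketch. It assumes the store nodes register ephemeral znodes (created with CreateMode.EPHEMERAL) under one hypothetical parent such as /stores, and that a watcher re-reads the child list with ZooKeeper.getChildren(path, watcher) each time it fires. A watch is triggered only once and only says that something changed, so the handler has to compare the new child list with the previous one itself. The MembershipDiff class and its method names are illustrative only and not part of ZooKeeper.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative helper (not part of ZooKeeper): compares two child-list snapshots. */
final class MembershipDiff {
    /** Store nodes present in the new snapshot but not in the old one. */
    final Set<String> joined;
    /** Store nodes present in the old snapshot but not in the new one. */
    final Set<String> departed;

    private MembershipDiff(Set<String> joined, Set<String> departed) {
        this.joined = joined;
        this.departed = departed;
    }

    /**
     * Computes which children appeared and which disappeared between two
     * snapshots of the parent znode's child list.
     */
    static MembershipDiff between(List<String> before, List<String> after) {
        Set<String> joined = new TreeSet<>(after);
        joined.removeAll(before);
        Set<String> departed = new TreeSet<>(before);
        departed.removeAll(after);
        return new MembershipDiff(joined, departed);
    }
}
```

A dispatch or middleware node could keep the last child list it saw, call MembershipDiff.between(previous, current) inside its watcher, stop routing to every name in departed, and then store current as the new baseline.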

Please see the following links to learn more about znodes and watches.

http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#sc_zkDataModel_znodes
http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#Ephemeral+Nodes
http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#ch_zkWatches

-Rakesh

Re: How does ZooKeeper handle fault detection in a distributed storage system

Posted by German Blanco <ge...@gmail.com>.

I am not sure that I get your point.
You need a ZooKeeper ensemble with several servers (the minimum recommended
number is 3). One of them is elected leader of the ensemble when
required, but normally you don't have to worry about that.
Normally you run the servers of the ZooKeeper ensemble as
independent processes, each on a different machine
in order to increase redundancy. These processes can run on any
machine in your network, including, I assume, the machines that host the
dispatch nodes, store nodes and middleware. On top of that, each dispatch,
store and middleware node must include a ZooKeeper client, which reads
and updates the information stored in the ZooKeeper ensemble.
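
To illustrate the numbers above, here is a minimal Java sketch of the majority-quorum arithmetic behind the three-server recommendation, plus the comma-separated host:port connect string that a ZooKeeper client is given so it can reach any server in the ensemble. The EnsembleMath class and the host names used with it are hypothetical; 2181 is the conventional client port.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative helper (not part of ZooKeeper): ensemble sizing arithmetic. */
final class EnsembleMath {
    private EnsembleMath() {}

    /** Number of servers that must be up for the ensemble to keep working (a strict majority). */
    static int quorumSize(int servers) {
        return servers / 2 + 1;
    }

    /** Number of server failures the ensemble can survive while keeping a majority. */
    static int toleratedFailures(int servers) {
        return servers - quorumSize(servers);
    }

    /** Builds a "host1:port,host2:port,..." connect string for the client. */
    static String connectString(List<String> hosts, int clientPort) {
        return hosts.stream()
                .map(host -> host + ":" + clientPort)
                .collect(Collectors.joining(","));
    }
}
```

Under this arithmetic, three servers survive one failure, four still survive only one, and five survive two, which is why ensembles usually have an odd number of servers.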




Re: How does ZooKeeper handle fault detection in a distributed storage system

Posted by fengguang gong <go...@icloud.com>.

Thanks very much, German.
The second possibility in your answer would be great, but I am still confused about
the servers (leader) and clients.
Should I distinguish between servers (leader) and clients among the dispatch nodes, store nodes and middleware?
Or should I just ignore all these concepts?

Re: How does ZooKeeper handle fault detection in a distributed storage system

Posted by German Blanco <ge...@gmail.com>.

Hello Fengguang Gong,
I think there is more than one answer to your question.
One possibility would be to make every one of your nodes a ZooKeeper client
that creates an ephemeral znode in the ZooKeeper data tree, and that queries
and, most likely, subscribes to changes so that it is notified about the
status of the ephemeral znodes created by the rest of the nodes. If you are
only interested in the dispatch and middleware nodes knowing about the status
of the store nodes, then you could have ephemeral znodes created only
by the store nodes, with the dispatch and middleware nodes querying and
subscribing to the resulting status.
You will need to make sure that store nodes going up and down
is reflected correctly in the creation and deletion of the corresponding znode.
You will also have to tune the heartbeat between the ZooKeeper client and
server so that it fits your requirements.
Does that suit you?
Any other options?
Good luck :-)
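
To make the heartbeat tuning a little more concrete: a crashed store node's ephemeral znode is removed only when its session expires, so the session timeout bounds how quickly a failure is noticed. The client requests a timeout when it connects, and the server clamps that request to a range which, by default, runs from 2 to 20 times the server's tickTime. The Java sketch below reproduces that clamping, assuming the server's minSessionTimeout and maxSessionTimeout are left at their defaults. The SessionTimeouts class and its method name are illustrative only and not part of ZooKeeper.

```java
/** Illustrative helper (not part of ZooKeeper): default session-timeout clamping. */
final class SessionTimeouts {
    private SessionTimeouts() {}

    /**
     * Returns the session timeout the server would grant, assuming the default
     * bounds of 2 * tickTime (minimum) and 20 * tickTime (maximum).
     */
    static int negotiated(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }
}
```

With a tickTime of 2000 ms, a request for 1000 ms is raised to 4000 ms and a request for 60000 ms is lowered to 40000 ms, so faster failure detection than 4 seconds would require a smaller tickTime or explicitly configured bounds.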
