Posted to user@zookeeper.apache.org by mufc_fan <ra...@gmail.com> on 2014/08/18 07:55:06 UTC

zookeeper in hadoop

I am new to ZooKeeper, and from the documentation I learned that "zookeeper is
a sub-project of hadoop", so ZooKeeper must be used in Hadoop. Can anyone
share how ZooKeeper is used in Hadoop, with a simple example?



--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/zookeeper-in-hadoop-tp7580179.html
Sent from the zookeeper-user mailing list archive at Nabble.com.

RE: zookeeper in hadoop

Posted by Rakesh R <ra...@huawei.com>.
>>>>>> My doubt is this: according to the documentation, the NameNode maintains a persistent session, so the created node will not be deleted. How, then, will the standby NameNode be triggered?

The HDFS community is the right place to ask this question.
As far as I know, the NN uses ephemeral znodes to handle machine crashes. When the active NN's znode is deleted, a watch event fires, and that event triggers the switch from standby to active.
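
The mechanism described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: ToyZk, ToyNameNode and the path /toy-ha/active-lock are hypothetical names invented for the sketch, and nothing here is the ZooKeeper client API or the actual HDFS code. It only shows the idea that an ephemeral znode disappears when its owner's session ends, and that a watcher on the znode is then notified.

```python
class ToyZk:
    """In-memory stand-in for a tiny slice of ZooKeeper semantics (not the real API)."""

    def __init__(self):
        self.nodes = {}    # path -> id of the session that owns the ephemeral node
        self.watches = {}  # path -> callbacks fired once when the path is deleted

    def create_ephemeral(self, path, session_id):
        # Only one client can hold the path; a second create fails.
        if path in self.nodes:
            return False
        self.nodes[path] = session_id
        return True

    def watch_delete(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def expire_session(self, session_id):
        # Ephemeral nodes are removed together with the session that created them.
        owned = [p for p, s in self.nodes.items() if s == session_id]
        for path in owned:
            del self.nodes[path]
            # Watches are one-shot: pop them, then notify each watcher.
            for callback in self.watches.pop(path, []):
                callback(path)


class ToyNameNode:
    LOCK = "/toy-ha/active-lock"  # hypothetical path, chosen for this sketch

    def __init__(self, name, zk):
        self.name, self.zk, self.active = name, zk, False

    def try_become_active(self, _path=None):
        if self.zk.create_ephemeral(self.LOCK, self.name):
            self.active = True
        else:
            # Someone else is active: wait for the lock node to be deleted.
            self.zk.watch_delete(self.LOCK, self.try_become_active)


def run_failover_demo():
    zk = ToyZk()
    nn1, nn2 = ToyNameNode("nn1", zk), ToyNameNode("nn2", zk)
    nn1.try_become_active()
    nn2.try_become_active()
    before = (nn1.active, nn2.active)
    zk.expire_session("nn1")  # simulate the nn1 machine crashing
    nn1.active = False        # the crashed node is no longer serving
    after = (nn1.active, nn2.active)
    return before, after, zk.nodes.get(ToyNameNode.LOCK)
```

Calling run_failover_demo() walks through one failover: nn1 takes the lock, nn2 sets a watch, nn1's session expires, and nn2's watch callback lets it take the lock.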

Regards,
Rakesh

-----Original Message-----
From: mufc_fan [mailto:rajeshkumarit8292@gmail.com] 
Sent: 18 August 2014 14:39
To: zookeeper-user@hadoop.apache.org
Subject: RE: zookeeper in hadoop

Thank you for your response. I have one doubt. For NameNode high availability, the documentation states that "each of the NameNode machines in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the ZooKeeper session will expire, notifying the other NameNode that a failover should be triggered".

My understanding is: if the NameNode creates an ephemeral node in ZooKeeper, then when the machine crashes the node gets deleted, which triggers the watch event, so that the standby NameNode creates a new ephemeral session with ZooKeeper and becomes the new NameNode.

My doubt is this: according to the documentation, the NameNode maintains a persistent session, so the created node will not be deleted. How, then, will the standby NameNode be triggered?




RE: zookeeper in hadoop

Posted by mufc_fan <ra...@gmail.com>.
Thank you for your response. I have one doubt. For NameNode high
availability, the documentation states that "each of the NameNode machines in
the cluster maintains a persistent session in ZooKeeper. If the machine
crashes, the ZooKeeper session will expire, notifying the other NameNode that
a failover should be triggered".

My understanding is: if the NameNode creates an ephemeral node in ZooKeeper,
then when the machine crashes the node gets deleted, which triggers the watch
event, so that the standby NameNode creates a new ephemeral session with
ZooKeeper and becomes the new NameNode.

My doubt is this: according to the documentation, the NameNode maintains a
persistent session, so the created node will not be deleted. How, then, will
the standby NameNode be triggered?
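
One way to read the quoted documentation passage: "persistent session" most plausibly means a long-lived session that stays open while the process is running, which is a different thing from a persistent znode. In ZooKeeper, an ephemeral znode exists only as long as the session that created it is active. The toy sketch below illustrates that reading. ToySession, reap, the 5-second timeout and the path are all hypothetical names and values made up for the example, not the ZooKeeper API.

```python
class ToySession:
    """Toy model of a session kept alive by heartbeats (not the ZooKeeper API)."""

    def __init__(self, timeout, now=0.0):
        self.timeout = timeout
        self.last_heartbeat = now
        self.ephemeral_paths = set()  # znodes that live and die with this session

    def heartbeat(self, now):
        self.last_heartbeat = now

    def is_expired(self, now):
        # The session expires once no heartbeat has arrived within the timeout.
        return now - self.last_heartbeat > self.timeout


def reap(session, tree, now):
    """Delete the session's ephemeral paths from `tree` if the session has expired."""
    if not session.is_expired(now):
        return []
    removed = sorted(session.ephemeral_paths & tree)
    tree.difference_update(removed)
    session.ephemeral_paths.clear()
    return removed


def run_expiry_demo():
    tree = {"/toy-ha/active-lock"}  # hypothetical znode path
    session = ToySession(timeout=5.0)
    session.ephemeral_paths.add("/toy-ha/active-lock")
    session.heartbeat(now=4.0)                 # last heartbeat before the "crash"
    alive = reap(session, tree, now=8.0)       # 4s of silence: session still valid
    crashed = reap(session, tree, now=12.0)    # 8s of silence: session expired
    return alive, crashed, tree
```

In this model the session can last indefinitely while heartbeats keep arriving, yet the znode is still removed once the heartbeats stop, which is the event a standby would be watching for.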



--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/zookeeper-in-hadoop-tp7580179p7580184.html
Sent from the zookeeper-user mailing list archive at Nabble.com.

RE: zookeeper in hadoop

Posted by Rakesh R <ra...@huawei.com>.
Hi,

AFAIK, the following are a few cases where ZK is used in Hadoop and BookKeeper components.

Hadoop uses ZK to provide high availability of its master processes, for example:

1) HDFS uses ZK to make the NameNode highly available.
Also, BKJM, a sub-module of HDFS that uses BookKeeper as the journal manager, stores the WAL (edit log transactions).
http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html

2) YARN uses ZK to make the ResourceManager highly available.
It also stores the state of the RM in ZooKeeper (ZKRMStateStore: a ZooKeeper-based state-store implementation).
http://hadoop.apache.org/docs/r2.4.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html#Configurations


BookKeeper uses ZK for:

1) Service discovery - to know the details of the running BK servers
2) Metadata storage - keeps all the metadata about the ledgers (user data)
3) Auto-recovery - handles missing replicas to improve fault tolerance
http://zookeeper.apache.org/bookkeeper/


Regards,
Rakesh
-----Original Message-----
From: mufc_fan [mailto:rajeshkumarit8292@gmail.com] 
Sent: 18 August 2014 11:25
To: zookeeper-user@hadoop.apache.org
Subject: zookeeper in hadoop

I am new to ZooKeeper, and from the documentation I learned that "zookeeper is a sub-project of hadoop", so ZooKeeper must be used in Hadoop. Can anyone share how ZooKeeper is used in Hadoop, with a simple example?


