Posted to common-user@hadoop.apache.org by Colin McCabe <cm...@alumni.cmu.edu> on 2013/01/14 20:49:57 UTC

Re: question about ZKFC daemon

Hi ESGLinux,

In production, you need to run QJM on at least 3 nodes.  You also need
to run ZooKeeper on at least 3 nodes (ZKFC itself runs only on the two
NameNode hosts).  You can run them on the same nodes if you like, though.
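
The "at least 3 nodes" figure follows from majority voting in the quorum services. The snippet below is only a minimal sketch of that arithmetic, not code from Hadoop or ZooKeeper; the function names are made up for illustration.

```python
def majority(n: int) -> int:
    """Smallest number of members that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return n - majority(n)


# Compare small ensemble sizes side by side.
for n in (1, 2, 3, 4, 5):
    print(f"{n} members: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Three members survive one failure, while four members still survive only one, which is why odd sizes such as 3 or 5 are the usual suggestion.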

Of course, none of this is "needed" to set up an example cluster.  If
you just want to try something out, you can run everything on the same
node if you want.  It depends on what you're trying to do.

cheers,
Colin


On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
> Thank you for your answer Craig,
>
> I'm planning my cluster, and for now I'm not sure how many machines I need ;-)
>
> If in doubt, I'll follow what Cloudera says, and if I have a problem I know
> where to ask for explanations :-)
>
> ESGLinux
>
>
>
> 2012/12/28 Craig Munro <cr...@gmail.com>
>>
>> OK, I have reliable storage on my datanodes so not an issue for me.  If
>> that's what Cloudera recommends then I'm sure it's fine.
>>
>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>
>>> Hi Craig,
>>>
>>> I'm a bit confused; I have read this from Cloudera:
>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>
>>> The JournalNode daemon is relatively lightweight, so these daemons can
>>> reasonably be collocated on machines with other Hadoop daemons, for example
>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
>>> JournalNodes' local directories can use the reliable local storage on those
>>> machines.
>>> There must be at least three JournalNode daemons, since edit log
>>> modifications must be written to a majority of JournalNodes
>>>
>>> As you can read, they recommend putting the JournalNode daemons on the
>>> NameNode hosts, but you say the opposite?
>>>
>>>
>>> Thanks for your answer,
>>>
>>> ESGLinux,
>>>
>>>
>>>
>>>
>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>>
>>>> You need the following:
>>>>
>>>> - active namenode + zkfc
>>>> - standby namenode + zkfc
>>>> - pool of journal nodes (odd number, 3 or more)
>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>>
>>>> As the journal nodes hold the namesystem transactions they should not be
>>>> co-located with the namenodes in case of failure.  I distribute the journal
>>>> and zookeeper nodes across the hosts running datanodes or as Harsh says you
>>>> could co-locate them on dedicated hosts.
>>>>
>>>> ZKFC does not monitor the JobTracker.
>>>>
>>>> Regards,
>>>> Craig
>>>>
>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Well, if I have understood you, I can configure my NN HA cluster this
>>>>> way:
>>>>>
>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>>>
>>>>> Is this right?
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> ESGLinux,
>>>>>
>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> There are two different things here: Automatic Failover and Quorum
>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>>>>>> is to manage failovers automatically (based on health checks of NNs).
>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>>>>>> storage for namesystem transactions that helps enable HA.
>>>>>>
>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>>>> quorum).
>>>>>>
>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com> wrote:
>>>>>> > Hi all,
>>>>>> >
>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA cluster.
>>>>>> >
>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>>>>>> > Failover Controllers (ZKFC). I plan to place these 3 daemons this way:
>>>>>> >
>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>>> > - JobTracker node + 1 ZKFC daemon (is this right?)
>>>>>> >
>>>>>> > so the quorum is formed by these three nodes. The nodes that run a
>>>>>> > NameNode make sense because the ZKFC monitors it, but what does the
>>>>>> > third daemon do?
>>>>>> >
>>>>>> > as I read from this url:
>>>>>> >
>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>>> >
>>>>>> > these daemons are only related to NameNodes ("Health monitoring - the
>>>>>> > ZKFC pings its local NameNode on a periodic basis with a health-check
>>>>>> > command."), so what does the third ZKFC do? I used the JobTracker node,
>>>>>> > but I could use another node without any daemon on it...
>>>>>> >
>>>>>> > Thanks in advance,
>>>>>> >
>>>>>> > ESGLInux,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>
>>>>>
>>>
>

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
ok,

That's the origin of my confusion: I thought they were the same.
I'm going to read this documentation to shed some light on ZooKeeper.
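
To keep the roles apart, here is a rough sketch of the layout rules discussed in this thread: one ZKFC next to each NameNode, plus an odd number (3 or more) of ZooKeeper servers and JournalNodes. The host names, role labels, and the `check_layout` helper are all hypothetical and exist only for this illustration; nothing here is a Hadoop API.

```python
from collections import Counter

# Hypothetical host names and role labels, chosen only for this illustration.
layout = {
    "master1": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "master2": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "master3": {"jobtracker", "journalnode", "zookeeper"},
}


def check_layout(layout):
    """Return a list of problems found in a host -> roles mapping."""
    problems = []
    counts = Counter(role for roles in layout.values() for role in roles)

    # ZKFC is a ZooKeeper client that sits next to each NameNode, nowhere else.
    for host, roles in sorted(layout.items()):
        if ("zkfc" in roles) != ("namenode" in roles):
            problems.append(f"{host}: zkfc and namenode must be deployed together")

    # ZooKeeper servers and JournalNodes are the quorum services.
    for role in ("zookeeper", "journalnode"):
        n = counts[role]
        if n < 3 or n % 2 == 0:
            problems.append(f"{role}: expected an odd count of 3 or more, found {n}")

    return problems


print(check_layout(layout))  # an empty list means no problems were found
```

A layout with a third ZKFC on the JobTracker node and no ZooKeeper servers, like the one I first proposed, would be reported as a problem by this check.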

thank you very much for your help,

ESGLinux,



2013/1/15 Harsh J <ha...@cloudera.com>

> No, ZooKeeper daemons == http://zookeeper.apache.org.
>
>
> On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <es...@gmail.com> wrote:
>
>> Hi Harsh,
>>
>> Now I'm completely confused :-))))
>>
>> As you pointed out, ZKFC runs only on the NN hosts. That looks right.
>>
>> So, what are the ZK peers (the odd number I'm looking for), and where do I
>> have to run them? On another 3 nodes?
>>
>> As I can read from the previous url:
>>
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes. Since ZooKeeper itself has light resource requirements, it
>> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
>> HDFS NameNode and Standby Node. Many operators choose to deploy the third
>> ZooKeeper process on the same node as the YARN ResourceManager. It is
>> advisable to configure the ZooKeeper nodes to store their data on separate
>> disk drives from the HDFS metadata for best performance and isolation.
>>
>> Here,  ZooKeeper daemons = ZKFC?
>>
>>
>> Thanks
>>
>> ESGLinux,
>>
>>
>>
>> 2013/1/15 Harsh J <ha...@cloudera.com>
>>
>>> Hi,
>>>
>>> I fail to see your confusion.
>>>
>>> ZKFC != ZK
>>>
>>> ZK is quorum-based software, like QJM. The ZK peers should be run in odd
>>> numbers, just as the JNs should.
>>>
>>> ZKFC is something the NN needs for its Automatic Failover capability. It
>>> is a client of ZK and therefore requires a running ZK ensemble, and the odd
>>> number of nodes is the suggestion for that ensemble. ZKFC itself is run
>>> only once per NN.
>>>
>>>
>>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm only testing the new HA feature; I'm not on a production system.
>>>>
>>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>>
>>>> In this url:
>>>>
>>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>>
>>>> you can read:
>>>> If you have configured automatic failover using the ZooKeeper
>>>> FailoverController (ZKFC), you must install and start the zkfc daemon
>>>> on each of the machines that runs a NameNode.
>>>>
>>>> So, the number of ZKFC daemons is two, but reading this URL:
>>>>
>>>>
>>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>>
>>>> you can read this:
>>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>>> three or five nodes
>>>>
>>>> I think that to ensure a good HA environment (of any kind) you need an
>>>> odd number of nodes to avoid split-brain. The problem I see here is that if
>>>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>>> (active + standby).
>>>>
>>>> So I'm a bit confused by this deployment...
>>>>
>>>> Any suggestion?
>>>>
>>>> Thanks in advance for all your answers
>>>>
>>>> Kind regards,
>>>>
>>>> ESGLinux
>>>>
>>>>
>>>>
>>>>
>>>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>>>
>>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>>>> wrote:
>>>>> > Hi ESGLinux,
>>>>> >
>>>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>>> > if you like, though.
>>>>>
>>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>>> active NN node and the standby NN node.
>>>>>
>>>>> Colin
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
>
> --
> Harsh J
>

>>>>> >>>>>>> > cluster,
>>>>> >>>>>>> >
>>>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>>>> ZooKeeper
>>>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons
>>>>> this way:
>>>>> >>>>>>> >
>>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>>> >>>>>>> >
>>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes
>>>>> that runs
>>>>> >>>>>>> > a
>>>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what
>>>>> does the
>>>>> >>>>>>> > third
>>>>> >>>>>>> > daemon?
>>>>> >>>>>>> >
>>>>> >>>>>>> > as I read from this url:
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>> >>>>>>> >
>>>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>>>> monitoring -
>>>>> >>>>>>> > the ZKFC
>>>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>>>> health-check
>>>>> >>>>>>> > command.)
>>>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but
>>>>> I could
>>>>> >>>>>>> > use
>>>>> >>>>>>> > another node without any daemon on it...
>>>>> >>>>>>> >
>>>>> >>>>>>> > Thanks in advance,
>>>>> >>>>>>> >
>>>>> >>>>>>> > ESGLInux,
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> --
>>>>> >>>>>>> Harsh J
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
>
> --
> Harsh J
>

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
OK,

That's the origin of my confusion: I thought they were the same.
I'm going to read this doc to shed a bit of light on ZooKeeper.

Thank you very much for your help,

ESGLinux,
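
For readers who hit the same ZK/ZKFC confusion, the placement rules given in this thread (one ZKFC on each NameNode host and nowhere else, an odd number of at least 3 ZooKeeper peers, an odd number of at least 3 JournalNodes) can be written down as a small check. This is only a sketch: the host names (nn1, nn2, jt1), the daemon labels, and the function check_ha_layout are made up for illustration and are not part of Hadoop.

```python
from collections import Counter


def check_ha_layout(layout):
    """Return a list of problems found in a host -> set-of-daemons mapping.

    The rules encoded here are the ones stated in the thread; the daemon
    labels are hypothetical strings, not Hadoop identifiers.
    """
    problems = []
    # Count how many hosts run each kind of daemon.
    counts = Counter(d for daemons in layout.values() for d in daemons)

    for host, daemons in layout.items():
        has_nn = "namenode" in daemons
        has_zkfc = "zkfc" in daemons
        if has_nn and not has_zkfc:
            problems.append(f"{host}: NameNode without a ZKFC")
        if has_zkfc and not has_nn:
            problems.append(f"{host}: ZKFC without a NameNode")

    # Quorum-based services: an odd number, 3 or more.
    for daemon in ("zookeeper", "journalnode"):
        n = counts[daemon]
        if n < 3 or n % 2 == 0:
            problems.append(f"{daemon}: {n} instances, expected an odd number >= 3")
    return problems


# The layout first proposed in this thread: a third ZKFC on the JobTracker host.
proposed = {
    "nn1": {"namenode", "zkfc"},
    "nn2": {"namenode", "zkfc"},
    "jt1": {"jobtracker", "zkfc"},
}

# One possible corrected layout: ZooKeeper and JournalNodes on three hosts,
# ZKFC only next to the two NameNodes.
corrected = {
    "nn1": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "nn2": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "jt1": {"jobtracker", "zookeeper", "journalnode"},
}

for name, layout in (("proposed", proposed), ("corrected", corrected)):
    print(name, check_ha_layout(layout))
```

Whether the quorum daemons share hosts with the NameNodes or sit on separate machines is a separate choice that the thread leaves open; the check above only counts daemons.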



2013/1/15 Harsh J <ha...@cloudera.com>

> No, ZooKeeper daemons == http://zookeeper.apache.org.
>
>
> On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <es...@gmail.com> wrote:
>
>> Hi Harsh,
>>
>> Now I'm completely confused :-))))
>>
>> As you pointed out, ZKFC runs only on the NN hosts. That looks right.
>>
>> So, what are the ZK peers (the odd number I'm looking for), and where do I
>> have to run them? On another 3 nodes?
>>
>> As I read at the previous URL:
>>
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes. Since ZooKeeper itself has light resource requirements, it
>> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
>> HDFS NameNode and Standby Node. Many operators choose to deploy the third
>> ZooKeeper process on the same node as the YARN ResourceManager. It is
>> advisable to configure the ZooKeeper nodes to store their data on separate
>> disk drives from the HDFS metadata for best performance and isolation.
>>
>> Here,  ZooKeeper daemons = ZKFC?
>>
>>
>> Thanks
>>
>> ESGLinux,
>>
>>
>>
>> 2013/1/15 Harsh J <ha...@cloudera.com>
>>
>>> Hi,
>>>
>>> I fail to see your confusion.
>>>
>>> ZKFC != ZK
>>>
>>> ZK is quorum-based software, as QJM is. The ZK peers are to be run in odd
>>> numbers, just as the JNs are.
>>>
>>> ZKFC is something the NN needs for its Automatic Failover capability. It
>>> is a client of ZK and therefore requires ZK to be present; the odd number of
>>> nodes is suggested for ZK. ZKFC itself is run only once per NN.
>>>
>>>
>>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm only testing the new HA feature; I'm not on a production system.
>>>>
>>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>>
>>>> At this URL:
>>>>
>>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>>
>>>> you can read:
>>>> If you have configured automatic failover using the ZooKeeper
>>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>>> each of the machines that runs a NameNode.
>>>>
>>>> So the number of ZKFC daemons is two, but at this URL:
>>>>
>>>>
>>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>>
>>>> you can read this:
>>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>>> three or five nodes
>>>>
>>>> I think that to ensure a good HA environment (of any kind) you need an
>>>> odd number of nodes to avoid split-brain. The problem I see here is that if
>>>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>>> (active+standby).
>>>>
>>>> So I'm a bit confused by this deployment...
>>>>
>>>> Any suggestion?
>>>>
>>>> Thanks in advance for all your answers
>>>>
>>>> Kind regards,
>>>>
>>>> ESGLinux
>>>>
>>>>
>>>>
>>>>
>>>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>>>
>>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>>>> wrote:
>>>>> > Hi ESGLinux,
>>>>> >
>>>>> > In production, you need to run QJM on at least 3 nodes.  You also
>>>>> need
>>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>>> > if you like, though.
>>>>>
>>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>>> active NN node and the standby NN node.
>>>>>
>>>>> Colin
>>>>>
>>>>> >
>>>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>>>> > you just want to try something out, you can run everything on the
>>>>> same
>>>>> > node if you want.  It depends on what you're trying to do.
>>>>> >
>>>>> > cheers,
>>>>> > Colin
>>>>> >
>>>>> >
>>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com>
>>>>> wrote:
>>>>> >> Thank you for your answer Craig,
>>>>> >>
>>>>> >> I´m planning my cluster and for now I´m not sure how many machines
>>>>> I need;-)
>>>>> >>
>>>>> >> If I have doubt i´ll what clouder say and If have a problem I have
>>>>> where to
>>>>> >> ask for explications :-)
>>>>> >>
>>>>> >> ESGLinux
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>>> >>>
>>>>> >>> OK, I have reliable storage on my datanodes so not an issue for
>>>>> me.  If
>>>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>>>> >>>
>>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>> >>>>
>>>>> >>>> Hi Craig,
>>>>> >>>>
>>>>> >>>> I´m a bit confused, I have read this from cloudera:
>>>>> >>>>
>>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>>> >>>>
>>>>> >>>> The JournalNode daemon is relatively lightweight, so these
>>>>> daemons can
>>>>> >>>> reasonably be collocated on machines with other Hadoop daemons,
>>>>> for example
>>>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>>>> etc.) so the
>>>>> >>>> JournalNodes' local directories can use the reliable local
>>>>> storage on those
>>>>> >>>> machines.
>>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>>> >>>> modifications must be written to a majority of JournalNodes
>>>>> >>>>
>>>>> >>>> as you can read they recommend to put journalnode daemons with the
>>>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> Thanks for your answer,
>>>>> >>>>
>>>>> >>>> ESGLinux,
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>>> >>>>>
>>>>> >>>>> You need the following:
>>>>> >>>>>
>>>>> >>>>> - active namenode + zkfc
>>>>> >>>>> - standby namenode + zkfc
>>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>>> >>>>>
>>>>> >>>>> As the journal nodes hold the namesystem transactions they
>>>>> should not be
>>>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>>>> the journal
>>>>> >>>>> and zookeeper nodes across the hosts running datanodes or as
>>>>> Harsh says you
>>>>> >>>>> could co-locate them on dedicated hosts.
>>>>> >>>>>
>>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>>> >>>>>
>>>>> >>>>> Regards,
>>>>> >>>>> Craig
>>>>> >>>>>
>>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Hi,
>>>>> >>>>>>
>>>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>>>> this
>>>>> >>>>>> way:
>>>>> >>>>>>
>>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>>> >>>>>>
>>>>> >>>>>> Is this right?
>>>>> >>>>>>
>>>>> >>>>>> Thanks in advance,
>>>>> >>>>>>
>>>>> >>>>>> ESGLinux,
>>>>> >>>>>>
>>>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>>> >>>>>>>
>>>>> >>>>>>> Hi,
>>>>> >>>>>>>
>>>>> >>>>>>> There are two different things here: Automatic Failover and
>>>>> Quorum
>>>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>>>> Controller,
>>>>> >>>>>>> is to manage failovers automatically (based on health checks
>>>>> of NNs).
>>>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>>>> shared
>>>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>>>> >>>>>>>
>>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>>>> for
>>>>> >>>>>>> reliable HA, preferably on nodes of their own if possible
>>>>> (like you
>>>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those
>>>>> as
>>>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>>> >>>>>>> quorum).
>>>>> >>>>>>>
>>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>>>> wrote:
>>>>> >>>>>>> > Hi all,
>>>>> >>>>>>> >
>>>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>>>> >>>>>>> > cluster,
>>>>> >>>>>>> >
>>>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>>>> ZooKeeper
>>>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons
>>>>> this way:
>>>>> >>>>>>> >
>>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>>> >>>>>>> >
>>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes
>>>>> that runs
>>>>> >>>>>>> > a
>>>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what
>>>>> does the
>>>>> >>>>>>> > third
>>>>> >>>>>>> > daemon?
>>>>> >>>>>>> >
>>>>> >>>>>>> > as I read from this url:
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>> >>>>>>> >
>>>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>>>> monitoring -
>>>>> >>>>>>> > the ZKFC
>>>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>>>> health-check
>>>>> >>>>>>> > command.)
>>>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but
>>>>> I could
>>>>> >>>>>>> > use
>>>>> >>>>>>> > another node without any daemon on it...
>>>>> >>>>>>> >
>>>>> >>>>>>> > Thanks in advance,
>>>>> >>>>>>> >
>>>>> >>>>>>> > ESGLInux,
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> --
>>>>> >>>>>>> Harsh J
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
>
> --
> Harsh J
>

Re: question about ZKFC daemon

Posted by Harsh J <ha...@cloudera.com>.
No, ZooKeeper daemons == http://zookeeper.apache.org.
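
The "odd number" advice for the ZooKeeper peers (and for the JournalNodes) comes down to majority arithmetic. A minimal sketch, assuming only that the service stays available while a strict majority of its peers is reachable; the function names are made up for illustration and are not from ZooKeeper or Hadoop:

```python
def quorum_size(n):
    """Smallest number of peers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many peers can be lost while a majority is still reachable."""
    return n - quorum_size(n)


# Compare a few ensemble sizes: peers, majority needed, failures tolerated.
for n in (2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under that assumption, 4 peers tolerate no more failures than 3 do, which is why the sizes usually quoted are 3 or 5. The two ZKFC daemons are not part of this count: they are clients of the ZooKeeper ensemble, one per NameNode.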


On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <es...@gmail.com> wrote:

> Hi Harsh,
>
> Now I'm completely confused :-))))
>
> As you pointed out, ZKFC runs only on the NN hosts. That looks right.
>
> So, what are the ZK peers (the odd number I'm looking for), and where do I
> have to run them? On another 3 nodes?
>
> As I read at the previous URL:
>
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes. Since ZooKeeper itself has light resource requirements, it
> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
> HDFS NameNode and Standby Node. Many operators choose to deploy the third
> ZooKeeper process on the same node as the YARN ResourceManager. It is
> advisable to configure the ZooKeeper nodes to store their data on separate
> disk drives from the HDFS metadata for best performance and isolation.
>
> Here,  ZooKeeper daemons = ZKFC?
>
>
> Thanks
>
> ESGLinux,
>
>
>
> 2013/1/15 Harsh J <ha...@cloudera.com>
>
>> Hi,
>>
>> I fail to see your confusion.
>>
>> ZKFC != ZK
>>
>> ZK is quorum-based software, as QJM is. The ZK peers are to be run in odd
>> numbers, just as the JNs are.
>>
>> ZKFC is something the NN needs for its Automatic Failover capability. It
>> is a client of ZK and therefore requires ZK to be present; the odd number of
>> nodes is suggested for ZK. ZKFC itself is run only once per NN.
>>
>>
>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm only testing the new HA feature; I'm not on a production system.
>>>
>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>
>>> At this URL:
>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>
>>> you can read:
>>> If you have configured automatic failover using the ZooKeeper
>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>> each of the machines that runs a NameNode.
>>>
>>> So the number of ZKFC daemons is two, but at this URL:
>>>
>>>
>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>
>>> you can read this:
>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>> three or five nodes
>>>
>>> I think that to ensure a good HA environment (of any kind) you need an
>>> odd number of nodes to avoid split-brain. The problem I see here is that if
>>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>> (active+standby).
>>>
>>> So I'm a bit confused by this deployment...
>>>
>>> Any suggestion?
>>>
>>> Thanks in advance for all your answers
>>>
>>> Kind regards,
>>>
>>> ESGLinux
>>>
>>>
>>>
>>>
>>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>>
>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>>> wrote:
>>>> > Hi ESGLinux,
>>>> >
>>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>> > if you like, though.
>>>>
>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>> active NN node and the standby NN node.
>>>>
>>>> Colin
>>>>
>>>> >
>>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>>> > you just want to try something out, you can run everything on the same
>>>> > node if you want.  It depends on what you're trying to do.
>>>> >
>>>> > cheers,
>>>> > Colin
>>>> >
>>>> >
>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com>
>>>> wrote:
>>>> >> Thank you for your answer Craig,
>>>> >>
>>>> >> I´m planning my cluster and for now I´m not sure how many machines I
>>>> need;-)
>>>> >>
>>>> >> If I have doubt i´ll what clouder say and If have a problem I have
>>>> where to
>>>> >> ask for explications :-)
>>>> >>
>>>> >> ESGLinux
>>>> >>
>>>> >>
>>>> >>
>>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>> >>>
>>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>>  If
>>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>>> >>>
>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hi Craig,
>>>> >>>>
>>>> >>>> I´m a bit confused, I have read this from cloudera:
>>>> >>>>
>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>> >>>>
>>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>>> can
>>>> >>>> reasonably be collocated on machines with other Hadoop daemons,
>>>> for example
>>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>>> etc.) so the
>>>> >>>> JournalNodes' local directories can use the reliable local storage
>>>> on those
>>>> >>>> machines.
>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>> >>>> modifications must be written to a majority of JournalNodes
>>>> >>>>
>>>> >>>> as you can read they recommend to put journalnode daemons with the
>>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>>> >>>>
>>>> >>>>
>>>> >>>> Thanks for your answer,
>>>> >>>>
>>>> >>>> ESGLinux,
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>> >>>>>
>>>> >>>>> You need the following:
>>>> >>>>>
>>>> >>>>> - active namenode + zkfc
>>>> >>>>> - standby namenode + zkfc
>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>> >>>>>
>>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>>> not be
>>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>>> the journal
>>>> >>>>> and zookeeper nodes across the hosts running datanodes or as
>>>> Harsh says you
>>>> >>>>> could co-locate them on dedicated hosts.
>>>> >>>>>
>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>> >>>>>
>>>> >>>>> Regards,
>>>> >>>>> Craig
>>>> >>>>>
>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi,
>>>> >>>>>>
>>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>>> this
>>>> >>>>>> way:
>>>> >>>>>>
>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>> >>>>>>
>>>> >>>>>> Is this right?
>>>> >>>>>>
>>>> >>>>>> Thanks in advance,
>>>> >>>>>>
>>>> >>>>>> ESGLinux,
>>>> >>>>>>
>>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>> >>>>>>>
>>>> >>>>>>> Hi,
>>>> >>>>>>>
>>>> >>>>>>> There are two different things here: Automatic Failover and
>>>> Quorum
>>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>>> Controller,
>>>> >>>>>>> is to manage failovers automatically (based on health checks of
>>>> NNs).
>>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>>> shared
>>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>>> >>>>>>>
>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>>> for
>>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>>>> you
>>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those
>>>> as
>>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>> >>>>>>> quorum).
>>>> >>>>>>>
>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>>> wrote:
>>>> >>>>>>> > Hi all,
>>>> >>>>>>> >
>>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>>> >>>>>>> > cluster,
>>>> >>>>>>> >
>>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>>> ZooKeeper
>>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons
>>>> this way:
>>>> >>>>>>> >
>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>> >>>>>>> >
>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes
>>>> that runs
>>>> >>>>>>> > a
>>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what
>>>> does the
>>>> >>>>>>> > third
>>>> >>>>>>> > daemon?
>>>> >>>>>>> >
>>>> >>>>>>> > as I read from this url:
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>> >>>>>>> >
>>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>>> monitoring -
>>>> >>>>>>> > the ZKFC
>>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>>> health-check
>>>> >>>>>>> > command.)
>>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>>>> could
>>>> >>>>>>> > use
>>>> >>>>>>> > another node without any daemon on it...
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks in advance,
>>>> >>>>>>> >
>>>> >>>>>>> > ESGLInux,
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --
>>>> >>>>>>> Harsh J
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>
>>>> >>
>>>>
>>>
>>>
>>
>>
>> --
>> Harsh J
>>
>
>


-- 
Harsh J

Re: question about ZKFC daemon

Posted by Harsh J <ha...@cloudera.com>.
No. The ZooKeeper daemons are the ZooKeeper servers themselves
(http://zookeeper.apache.org), not ZKFC.
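
To make the odd-number advice concrete, here is a rough sketch of the majority arithmetic that applies to the ZooKeeper peers (and likewise to the JournalNodes). The function names are made up for illustration and are not part of Hadoop or ZooKeeper. It only assumes what the docs quoted in this thread say: writes must reach a majority of the ensemble. ZKFC is not part of that count, since it is just a ZooKeeper client that runs one per NameNode.

```python
# Illustrative sketch only: majority-quorum arithmetic for an ensemble such as
# the ZooKeeper peers or the JournalNodes. These helper names are hypothetical
# and do not come from any Hadoop or ZooKeeper API.

def majority(n: int) -> int:
    """Smallest number of members that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many members can be lost while a majority is still available."""
    return n - majority(n)


def summarize(n: int) -> str:
    """One-line description of an ensemble of size n."""
    return (f"{n} peers: majority={majority(n)}, "
            f"tolerated failures={tolerated_failures(n)}")


if __name__ == "__main__":
    # An even-sized ensemble tolerates no more failures than the odd size
    # just below it, which is why 3 or 5 peers are the usual choices.
    for size in (1, 2, 3, 4, 5):
        print(summarize(size))
```

Going from 3 to 4 peers does not buy an extra tolerated failure, while going from 3 to 5 does, so the extra even peer only adds cost. The number of ZKFC daemons is unrelated to this: it simply equals the number of NameNodes (two in an active + standby setup).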


On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <es...@gmail.com> wrote:

> Hi Harsh,
>
> Now I´m confussed at all :-))))
>
> as you pointed ZKFC runs only in the NN. That´s looks right.
>
> So, what are ZK peers (the odd number I´m looking for) and where I have to
> run them? on another 3 nodes?
>
> As I can read from the previous url:
>
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes. Since ZooKeeper itself has light resource requirements, it
> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
> HDFS NameNode and Standby Node. Many operators choose to deploy the third
> ZooKeeper process on the same node as the YARN ResourceManager. It is
> advisable to configure the ZooKeeper nodes to store their data on separate
> disk drives from the HDFS metadata for best performance and isolation.
>
> Here,  ZooKeeper daemons = ZKFC?
>
>
> Thanks
>
> ESGLinux,
>
>
>
> 2013/1/15 Harsh J <ha...@cloudera.com>
>
>> Hi,
>>
>> I fail to see your confusion.
>>
>> ZKFC != ZK
>>
>> ZK is a quorum software, like QJM is. The ZK peers are to be run odd in
>> numbers, such as JNs are to be.
>>
>> ZKFC is something the NN needs for its Automatic Failover capability. It
>> is a client to ZK and thereby demands ZK's presence; for which the odd # of
>> nodes is suggested. ZKFC itself is only to be run one per NN.
>>
>>
>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I´m only testing the new HA feature. I´m not in a production system,
>>>
>>> Well, let´s talk about the number of nodes and the ZKFC daemons.
>>>
>>> In this url:
>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>
>>> you can read:
>>> If you have configured automatic failover using the ZooKeeper
>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>> each of the machines that runs a NameNode.
>>>
>>> So, the number of ZKFC daemons are two, but reading this url:
>>>
>>>
>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>
>>> you can read this:
>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>> three or five nodes
>>>
>>> I think that to ensure a good HA enviroment (of any kind) you need and
>>> odd number of nodes to avoid split-brain. The problem I see here is that If
>>> ZKFC monitors NameNodes in a CDH4 enviroment you only have 2 NN
>>> (active+standby).
>>>
>>> So I´m a bit confussed with this deployment...
>>>
>>> Any suggestion?
>>>
>>> Thanks in advance for all your answers
>>>
>>> Kind regards,
>>>
>>> ESGLinux
>>>
>>>
>>>
>>>
>>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>>
>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>>> wrote:
>>>> > Hi ESGLinux,
>>>> >
>>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>>> > if you like, though.
>>>>
>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>>> active NN node and the standby NN node.
>>>>
>>>> Colin
>>>>
>>>> >
>>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>>> > you just want to try something out, you can run everything on the same
>>>> > node if you want.  It depends on what you're trying to do.
>>>> >
>>>> > cheers,
>>>> > Colin
>>>> >
>>>> >
>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com>
>>>> wrote:
>>>> >> Thank you for your answer Craig,
>>>> >>
>>>> >> I´m planning my cluster and for now I´m not sure how many machines I
>>>> need;-)
>>>> >>
>>>> >> If I have doubt i´ll what clouder say and If have a problem I have
>>>> where to
>>>> >> ask for explications :-)
>>>> >>
>>>> >> ESGLinux
>>>> >>
>>>> >>
>>>> >>
>>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>> >>>
>>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>>  If
>>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>>> >>>
>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hi Craig,
>>>> >>>>
>>>> >>>> I´m a bit confused, I have read this from cloudera:
>>>> >>>>
>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>> >>>>
>>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>>> can
>>>> >>>> reasonably be collocated on machines with other Hadoop daemons,
>>>> for example
>>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>>> etc.) so the
>>>> >>>> JournalNodes' local directories can use the reliable local storage
>>>> on those
>>>> >>>> machines.
>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>> >>>> modifications must be written to a majority of JournalNodes
>>>> >>>>
>>>> >>>> as you can read they recommend to put journalnode daemons with the
>>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>>> >>>>
>>>> >>>>
>>>> >>>> Thanks for your answer,
>>>> >>>>
>>>> >>>> ESGLinux,
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>> >>>>>
>>>> >>>>> You need the following:
>>>> >>>>>
>>>> >>>>> - active namenode + zkfc
>>>> >>>>> - standby namenode + zkfc
>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>> >>>>>
>>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>>> not be
>>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>>> the journal
>>>> >>>>> and zookeeper nodes across the hosts running datanodes or as
>>>> Harsh says you
>>>> >>>>> could co-locate them on dedicated hosts.
>>>> >>>>>
>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>> >>>>>
>>>> >>>>> Regards,
>>>> >>>>> Craig
>>>> >>>>>
>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi,
>>>> >>>>>>
>>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>>> this
>>>> >>>>>> way:
>>>> >>>>>>
>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>> >>>>>>
>>>> >>>>>> Is this right?
>>>> >>>>>>
>>>> >>>>>> Thanks in advance,
>>>> >>>>>>
>>>> >>>>>> ESGLinux,
>>>> >>>>>>
>>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>> >>>>>>>
>>>> >>>>>>> Hi,
>>>> >>>>>>>
>>>> >>>>>>> There are two different things here: Automatic Failover and
>>>> Quorum
>>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>>> Controller,
>>>> >>>>>>> is to manage failovers automatically (based on health checks of
>>>> NNs).
>>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>>> shared
>>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>>> >>>>>>>
>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>>> for
>>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>>>> you
>>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those
>>>> as
>>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>> >>>>>>> quorum).
>>>> >>>>>>>
>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>>> wrote:
>>>> >>>>>>> > Hi all,
>>>> >>>>>>> >
>>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>>> >>>>>>> > cluster,
>>>> >>>>>>> >
>>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>>> ZooKeeper
>>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons
>>>> this way:
>>>> >>>>>>> >
>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>> >>>>>>> >
>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes
>>>> that runs
>>>> >>>>>>> > a
>>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what
>>>> does the
>>>> >>>>>>> > third
>>>> >>>>>>> > daemon?
>>>> >>>>>>> >
>>>> >>>>>>> > as I read from this url:
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>> >>>>>>> >
>>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>>> monitoring -
>>>> >>>>>>> > the ZKFC
>>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>>> health-check
>>>> >>>>>>> > command.)
>>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>>>> could
>>>> >>>>>>> > use
>>>> >>>>>>> > another node without any daemon on it...
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks in advance,
>>>> >>>>>>> >
>>>> >>>>>>> > ESGLInux,
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> --
>>>> >>>>>>> Harsh J
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>
>>>> >>
>>>>
>>>
>>>
>>
>>
>> --
>> Harsh J
>>
>
>


-- 
Harsh J

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
Hi Harsh,

Now I'm completely confused :-))))

As you pointed out, ZKFC runs only on the NameNode hosts. That looks right.

So, what are the ZK peers (the odd number I'm looking for), and where do I
have to run them? On another 3 nodes?

As I read at the previous URL:

In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes. Since ZooKeeper itself has light resource requirements, it
is acceptable to collocate the ZooKeeper nodes on the same hardware as the
HDFS NameNode and Standby Node. Many operators choose to deploy the third
ZooKeeper process on the same node as the YARN ResourceManager. It is
advisable to configure the ZooKeeper nodes to store their data on separate
disk drives from the HDFS metadata for best performance and isolation.

Here, does "ZooKeeper daemons" mean ZKFC?


Thanks

ESGLinux,



2013/1/15 Harsh J <ha...@cloudera.com>

> Hi,
>
> I fail to see your confusion.
>
> ZKFC != ZK
>
> ZK is a quorum software, like QJM is. The ZK peers are to be run odd in
> numbers, such as JNs are to be.
>
> ZKFC is something the NN needs for its Automatic Failover capability. It
> is a client to ZK and thereby demands ZK's presence; for which the odd # of
> nodes is suggested. ZKFC itself is only to be run one per NN.
>
>
> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>
>> Hi all,
>>
>> I´m only testing the new HA feature. I´m not in a production system,
>>
>> Well, let´s talk about the number of nodes and the ZKFC daemons.
>>
>> In this url:
>>
>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>
>> you can read:
>> If you have configured automatic failover using the ZooKeeper
>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>> each of the machines that runs a NameNode.
>>
>> So, the number of ZKFC daemons are two, but reading this url:
>>
>>
>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>
>> you can read this:
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes
>>
>> I think that to ensure a good HA enviroment (of any kind) you need and
>> odd number of nodes to avoid split-brain. The problem I see here is that If
>> ZKFC monitors NameNodes in a CDH4 enviroment you only have 2 NN
>> (active+standby).
>>
>> So I´m a bit confussed with this deployment...
>>
>> Any suggestion?
>>
>> Thanks in advance for all your answers
>>
>> Kind regards,
>>
>> ESGLinux
>>
>>
>>
>>
>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>
>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>> wrote:
>>> > Hi ESGLinux,
>>> >
>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>> > if you like, though.
>>>
>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>> active NN node and the standby NN node.
>>>
>>> Colin
>>>
>>> >
>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>> > you just want to try something out, you can run everything on the same
>>> > node if you want.  It depends on what you're trying to do.
>>> >
>>> > cheers,
>>> > Colin
>>> >
>>> >
>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>>> >> Thank you for your answer Craig,
>>> >>
>>> >> I´m planning my cluster and for now I´m not sure how many machines I
>>> need;-)
>>> >>
>>> >> If I have doubt i´ll what clouder say and If have a problem I have
>>> where to
>>> >> ask for explications :-)
>>> >>
>>> >> ESGLinux
>>> >>
>>> >>
>>> >>
>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>
>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>  If
>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>> >>>
>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi Craig,
>>> >>>>
>>> >>>> I´m a bit confused, I have read this from cloudera:
>>> >>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>> >>>>
>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>> can
>>> >>>> reasonably be collocated on machines with other Hadoop daemons, for
>>> example
>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>> etc.) so the
>>> >>>> JournalNodes' local directories can use the reliable local storage
>>> on those
>>> >>>> machines.
>>> >>>> There must be at least three JournalNode daemons, since edit log
>>> >>>> modifications must be written to a majority of JournalNodes
>>> >>>>
>>> >>>> as you can read they recommend to put journalnode daemons with the
>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>> >>>>
>>> >>>>
>>> >>>> Thanks for your answer,
>>> >>>>
>>> >>>> ESGLinux,
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>>>
>>> >>>>> You need the following:
>>> >>>>>
>>> >>>>> - active namenode + zkfc
>>> >>>>> - standby namenode + zkfc
>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>> >>>>>
>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>> not be
>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>> the journal
>>> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
>>> says you
>>> >>>>> could co-locate them on dedicated hosts.
>>> >>>>>
>>> >>>>> ZKFC does not monitor the JobTracker.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Craig
>>> >>>>>
>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>> this
>>> >>>>>> way:
>>> >>>>>>
>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>> >>>>>>
>>> >>>>>> Is this right?
>>> >>>>>>
>>> >>>>>> Thanks in advance,
>>> >>>>>>
>>> >>>>>> ESGLinux,
>>> >>>>>>
>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>> >>>>>>>
>>> >>>>>>> Hi,
>>> >>>>>>>
>>> >>>>>>> There are two different things here: Automatic Failover and
>>> Quorum
>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>> Controller,
>>> >>>>>>> is to manage failovers automatically (based on health checks of
>>> NNs).
>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>> shared
>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>> >>>>>>>
>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>> for
>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>>> you
>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>> >>>>>>> quorum).
>>> >>>>>>>
>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>> wrote:
>>> >>>>>>> > Hi all,
>>> >>>>>>> >
>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>> >>>>>>> > cluster,
>>> >>>>>>> >
>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>> ZooKeeper
>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
>>> way:
>>> >>>>>>> >
>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>> >>>>>>> >
>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
>>> runs
>>> >>>>>>> > a
>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
>>> the
>>> >>>>>>> > third
>>> >>>>>>> > daemon?
>>> >>>>>>> >
>>> >>>>>>> > as I read from this url:
>>> >>>>>>> >
>>> >>>>>>> >
>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>> >>>>>>> >
>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>> monitoring -
>>> >>>>>>> > the ZKFC
>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>> health-check
>>> >>>>>>> > command.)
>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>>> could
>>> >>>>>>> > use
>>> >>>>>>> > another node without any daemon on it...
>>> >>>>>>> >
>>> >>>>>>> > Thanks in advance,
>>> >>>>>>> >
>>> >>>>>>> > ESGLInux,
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Harsh J
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>
>>>
>>
>>
>
>
> --
> Harsh J
>

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
Hi Harsh,

Now I'm completely confused :-))))

As you pointed out, ZKFC runs only on the NameNode hosts. That looks right.

So, what are the ZK peers (the odd number I'm looking for), and where do I
have to run them? On another 3 nodes?

As I read at the previous URL:

In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes. Since ZooKeeper itself has light resource requirements, it
is acceptable to collocate the ZooKeeper nodes on the same hardware as the
HDFS NameNode and Standby Node. Many operators choose to deploy the third
ZooKeeper process on the same node as the YARN ResourceManager. It is
advisable to configure the ZooKeeper nodes to store their data on separate
disk drives from the HDFS metadata for best performance and isolation.

Here, does "ZooKeeper daemons" mean ZKFC?


Thanks

ESGLinux,



2013/1/15 Harsh J <ha...@cloudera.com>

> Hi,
>
> I fail to see your confusion.
>
> ZKFC != ZK
>
> ZK is a quorum software, like QJM is. The ZK peers are to be run odd in
> numbers, such as JNs are to be.
>
> ZKFC is something the NN needs for its Automatic Failover capability. It
> is a client to ZK and thereby demands ZK's presence; for which the odd # of
> nodes is suggested. ZKFC itself is only to be run one per NN.
>
>
> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>
>> Hi all,
>>
>> I´m only testing the new HA feature. I´m not in a production system,
>>
>> Well, let´s talk about the number of nodes and the ZKFC daemons.
>>
>> In this url:
>>
>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>
>> you can read:
>> If you have configured automatic failover using the ZooKeeper
>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>> each of the machines that runs a NameNode.
>>
>> So, the number of ZKFC daemons are two, but reading this url:
>>
>>
>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>
>> you can read this:
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes
>>
>> I think that to ensure a good HA enviroment (of any kind) you need and
>> odd number of nodes to avoid split-brain. The problem I see here is that If
>> ZKFC monitors NameNodes in a CDH4 enviroment you only have 2 NN
>> (active+standby).
>>
>> So I´m a bit confussed with this deployment...
>>
>> Any suggestion?
>>
>> Thanks in advance for all your answers
>>
>> Kind regards,
>>
>> ESGLinux
>>
>>
>>
>>
>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>
>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>> wrote:
>>> > Hi ESGLinux,
>>> >
>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>> > if you like, though.
>>>
>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>> active NN node and the standby NN node.
>>>
>>> Colin
>>>
>>> >
>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>> > you just want to try something out, you can run everything on the same
>>> > node if you want.  It depends on what you're trying to do.
>>> >
>>> > cheers,
>>> > Colin
>>> >
>>> >
>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>>> >> Thank you for your answer Craig,
>>> >>
>>> >> I´m planning my cluster and for now I´m not sure how many machines I
>>> need;-)
>>> >>
>>> >> If I have doubt i´ll what clouder say and If have a problem I have
>>> where to
>>> >> ask for explications :-)
>>> >>
>>> >> ESGLinux
>>> >>
>>> >>
>>> >>
>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>
>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>  If
>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>> >>>
>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi Craig,
>>> >>>>
>>> >>>> I´m a bit confused, I have read this from cloudera:
>>> >>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>> >>>>
>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>> can
>>> >>>> reasonably be collocated on machines with other Hadoop daemons, for
>>> example
>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>> etc.) so the
>>> >>>> JournalNodes' local directories can use the reliable local storage
>>> on those
>>> >>>> machines.
>>> >>>> There must be at least three JournalNode daemons, since edit log
>>> >>>> modifications must be written to a majority of JournalNodes
>>> >>>>
>>> >>>> as you can read they recommend to put journalnode daemons with the
>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>> >>>>
>>> >>>>
>>> >>>> Thanks for your answer,
>>> >>>>
>>> >>>> ESGLinux,
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>>>
>>> >>>>> You need the following:
>>> >>>>>
>>> >>>>> - active namenode + zkfc
>>> >>>>> - standby namenode + zkfc
>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>> >>>>>
>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>> not be
>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>> the journal
>>> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
>>> says you
>>> >>>>> could co-locate them on dedicated hosts.
>>> >>>>>
>>> >>>>> ZKFC does not monitor the JobTracker.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Craig
>>> >>>>>
>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>> this
>>> >>>>>> way:
>>> >>>>>>
>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>> >>>>>>
>>> >>>>>> Is this right?
>>> >>>>>>
>>> >>>>>> Thanks in advance,
>>> >>>>>>
>>> >>>>>> ESGLinux,
>>> >>>>>>
>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>> >>>>>>>
>>> >>>>>>> Hi,
>>> >>>>>>>
>>> >>>>>>> There are two different things here: Automatic Failover and
>>> Quorum
>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>> Controller,
>>> >>>>>>> is to manage failovers automatically (based on health checks of
>>> NNs).
>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>> shared
>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>> >>>>>>>
>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>> for
>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>>> you
>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>> >>>>>>> quorum).
>>> >>>>>>>
>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>> wrote:
>>> >>>>>>> > Hi all,
>>> >>>>>>> >
>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>> >>>>>>> > cluster,
>>> >>>>>>> >
>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>> ZooKeeper
>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
>>> way:
>>> >>>>>>> >
>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>> >>>>>>> >
>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
>>> runs
>>> >>>>>>> > a
>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
>>> the
>>> >>>>>>> > third
>>> >>>>>>> > daemon?
>>> >>>>>>> >
>>> >>>>>>> > as I read from this url:
>>> >>>>>>> >
>>> >>>>>>> >
>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>> >>>>>>> >
>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>> monitoring -
>>> >>>>>>> > the ZKFC
>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>> health-check
>>> >>>>>>> > command.)
>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>>> could
>>> >>>>>>> > use
>>> >>>>>>> > another node without any daemon on it...
>>> >>>>>>> >
>>> >>>>>>> > Thanks in advance,
>>> >>>>>>> >
>>> >>>>>>> > ESGLInux,
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Harsh J
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>
>>>
>>
>>
>
>
> --
> Harsh J
>

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
Hi Harsh,

Now I'm completely confused :-))))

As you pointed out, ZKFC runs only on the NN hosts. That looks right.

So, what are the ZK peers (the odd number I'm looking for), and where do I
have to run them? On another 3 nodes?

As I read at the previous URL:

In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes. Since ZooKeeper itself has light resource requirements, it
is acceptable to collocate the ZooKeeper nodes on the same hardware as the
HDFS NameNode and Standby Node. Many operators choose to deploy the third
ZooKeeper process on the same node as the YARN ResourceManager. It is
advisable to configure the ZooKeeper nodes to store their data on separate
disk drives from the HDFS metadata for best performance and isolation.

Here, ZooKeeper daemons = ZKFC?


Thanks

ESGLinux,



2013/1/15 Harsh J <ha...@cloudera.com>

> Hi,
>
> I fail to see your confusion.
>
> ZKFC != ZK
>
> ZK is quorum-based software, as QJM is. ZK peers should be run in odd
> numbers, just as JNs should.
>
> ZKFC is something the NN needs for its Automatic Failover capability. It
> is a client of ZK and therefore requires ZK to be present; the suggested
> odd number of nodes applies to ZK. ZKFC itself should be run only one per NN.
>
>
> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'm only testing the new HA feature. I'm not on a production system.
>>
>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>
>> At this URL:
>>
>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>
>> you can read:
>> If you have configured automatic failover using the ZooKeeper
>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>> each of the machines that runs a NameNode.
>>
>> So, the number of ZKFC daemons is two, but reading this URL:
>>
>>
>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>
>> you can read this:
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes
>>
>> I think that to ensure a good HA environment (of any kind) you need an
>> odd number of nodes to avoid split-brain. The problem I see here is that, if
>> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>> (active+standby).
>>
>> So I'm a bit confused by this deployment...
>>
>> Any suggestion?
>>
>> Thanks in advance for all your answers
>>
>> Kind regards,
>>
>> ESGLinux
>>
>>
>>
>>
>> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>>
>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>>> wrote:
>>> > Hi ESGLinux,
>>> >
>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>> > if you like, though.
>>>
>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>> active NN node and the standby NN node.
>>>
>>> Colin
>>>
>>> >
>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>> > you just want to try something out, you can run everything on the same
>>> > node if you want.  It depends on what you're trying to do.
>>> >
>>> > cheers,
>>> > Colin
>>> >
>>> >
>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>>> >> Thank you for your answer Craig,
>>> >>
>>> >> I'm planning my cluster and for now I'm not sure how many machines I
>>> >> need ;-)
>>> >>
>>> >> If I have doubts I'll do what Cloudera says, and if I have a problem I
>>> >> have somewhere to ask for explanations :-)
>>> >>
>>> >> ESGLinux
>>> >>
>>> >>
>>> >>
>>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>
>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>  If
>>> >>> that's what Cloudera recommends then I'm sure it's fine.
>>> >>>
>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi Craig,
>>> >>>>
>>> >>>> I´m a bit confused, I have read this from cloudera:
>>> >>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>> >>>>
>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>> can
>>> >>>> reasonably be collocated on machines with other Hadoop daemons, for
>>> example
>>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>> etc.) so the
>>> >>>> JournalNodes' local directories can use the reliable local storage
>>> on those
>>> >>>> machines.
>>> >>>> There must be at least three JournalNode daemons, since edit log
>>> >>>> modifications must be written to a majority of JournalNodes
>>> >>>>
>>> >>>> as you can read they recommend to put journalnode daemons with the
>>> >>>> namenodes, but you say the opposite.??¿?¿??
>>> >>>>
>>> >>>>
>>> >>>> Thanks for your answer,
>>> >>>>
>>> >>>> ESGLinux,
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>> >>>>>
>>> >>>>> You need the following:
>>> >>>>>
>>> >>>>> - active namenode + zkfc
>>> >>>>> - standby namenode + zkfc
>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>> >>>>>
>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>> not be
>>> >>>>> co-located with the namenodes in case of failure.  I distribute
>>> the journal
>>> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
>>> says you
>>> >>>>> could co-locate them on dedicated hosts.
>>> >>>>>
>>> >>>>> ZKFC does not monitor the JobTracker.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Craig
>>> >>>>>
>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>>> this
>>> >>>>>> way:
>>> >>>>>>
>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>> >>>>>>
>>> >>>>>> Is this right?
>>> >>>>>>
>>> >>>>>> Thanks in advance,
>>> >>>>>>
>>> >>>>>> ESGLinux,
>>> >>>>>>
>>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>> >>>>>>>
>>> >>>>>>> Hi,
>>> >>>>>>>
>>> >>>>>>> There are two different things here: Automatic Failover and
>>> Quorum
>>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>>> Controller,
>>> >>>>>>> is to manage failovers automatically (based on health checks of
>>> NNs).
>>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of
>>> shared
>>> >>>>>>> storage for namesystem transactions that helps enable HA.
>>> >>>>>>>
>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>> for
>>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>>> you
>>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>> >>>>>>> quorum).
>>> >>>>>>>
>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>>> wrote:
>>> >>>>>>> > Hi all,
>>> >>>>>>> >
>>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>> >>>>>>> > cluster,
>>> >>>>>>> >
>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>> ZooKeeper
>>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
>>> way:
>>> >>>>>>> >
>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>> >>>>>>> >
>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
>>> runs
>>> >>>>>>> > a
>>> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
>>> the
>>> >>>>>>> > third
>>> >>>>>>> > daemon?
>>> >>>>>>> >
>>> >>>>>>> > as I read from this url:
>>> >>>>>>> >
>>> >>>>>>> >
>>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>> >>>>>>> >
>>> >>>>>>> > this daemons are only related with NameNodes, (Health
>>> monitoring -
>>> >>>>>>> > the ZKFC
>>> >>>>>>> > pings its local NameNode on a periodic basis with a
>>> health-check
>>> >>>>>>> > command.)
>>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>>> could
>>> >>>>>>> > use
>>> >>>>>>> > another node without any daemon on it...
>>> >>>>>>> >
>>> >>>>>>> > Thanks in advance,
>>> >>>>>>> >
>>> >>>>>>> > ESGLInux,
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Harsh J
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>
>>>
>>
>>
>
>
> --
> Harsh J
>

Re: question about ZKFC daemon

Posted by Harsh J <ha...@cloudera.com>.
Hi,

I fail to see your confusion.

ZKFC != ZK

ZK is quorum-based software, as QJM is. ZK peers should be run in odd
numbers, just as JNs should.

ZKFC is something the NN needs for its Automatic Failover capability. It is
a client of ZK and therefore requires ZK to be present; the suggested odd
number of nodes applies to ZK. ZKFC itself should be run only one per NN.
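
To make the odd-number point concrete, here is a minimal Python sketch of the
majority arithmetic that applies to any majority-quorum service such as ZK or
the JNs. It is only an illustration: the helper names majority and
tolerated_failures, and the variable names, are made up for this example and
are not part of Hadoop or ZooKeeper.

```python
# Illustrative sketch only: majority arithmetic for a quorum of n peers.
# All names here are hypothetical and exist only for this example.

def majority(n: int) -> int:
    """Smallest number of peers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Number of peers that may fail while a majority can still be formed."""
    return n - majority(n)


for n in range(1, 6):
    print(f"{n} quorum peers: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")

# ZKFC is not a quorum member. Its count follows the NameNodes:
namenodes = 2              # active + standby
zkfc_daemons = namenodes   # one ZKFC per NN
print(f"ZKFC daemons: {zkfc_daemons}")
```

An even-sized ensemble tolerates no more failures than the next smaller odd
one (4 peers still tolerate only 1 failure), which is why 3 or 5 is the usual
suggestion. The ZKFC count is unrelated to this arithmetic; it simply follows
the number of NNs.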


On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:

> Hi all,
>
> I'm only testing the new HA feature. I'm not on a production system.
>
> Well, let's talk about the number of nodes and the ZKFC daemons.
>
> At this URL:
>
> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>
> you can read:
> If you have configured automatic failover using the ZooKeeper
> FailoverController (ZKFC), you must install and start the zkfc daemon on
> each of the machines that runs a NameNode.
>
> So, the number of ZKFC daemons is two, but reading this URL:
>
>
> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>
> you can read this:
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes
>
> I think that to ensure a good HA environment (of any kind) you need an odd
> number of nodes to avoid split-brain. The problem I see here is that, if
> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
> (active+standby).
>
> So I'm a bit confused by this deployment...
>
> Any suggestion?
>
> Thanks in advance for all your answers
>
> Kind regards,
>
> ESGLinux
>
>
>
>
> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>
>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>> wrote:
>> > Hi ESGLinux,
>> >
>> > In production, you need to run QJM on at least 3 nodes.  You also need
>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>> > if you like, though.
>>
>> Er, this should read "You also need to run ZooKeeper on at least 3
>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>> active NN node and the standby NN node.
>>
>> Colin
>>
>> >
>> > Of course, none of this is "needed" to set up an example cluster.  If
>> > you just want to try something out, you can run everything on the same
>> > node if you want.  It depends on what you're trying to do.
>> >
>> > cheers,
>> > Colin
>> >
>> >
>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>> >> Thank you for your answer Craig,
>> >>
>> >> I'm planning my cluster and for now I'm not sure how many machines I
>> >> need ;-)
>> >>
>> >> If I have doubts I'll do what Cloudera says, and if I have a problem I
>> >> have somewhere to ask for explanations :-)
>> >>
>> >> ESGLinux
>> >>
>> >>
>> >>
>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>> >>>
>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>> >>> If that's what Cloudera recommends then I'm sure it's fine.
>> >>>
>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>> >>>>
>> >>>> Hi Craig,
>> >>>>
>> >>>> I'm a bit confused. I have read this from Cloudera:
>> >>>>
>> >>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>> >>>>
>> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
>> >>>> reasonably be collocated on machines with other Hadoop daemons, for
>> >>>> example NameNodes, the JobTracker, or the YARN ResourceManager.
>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.)
>> >>>> so the JournalNodes' local directories can use the reliable local
>> >>>> storage on those machines.
>> >>>> There must be at least three JournalNode daemons, since edit log
>> >>>> modifications must be written to a majority of JournalNodes.
>> >>>>
>> >>>> As you can read, they recommend putting the JournalNode daemons with the
>> >>>> NameNodes, but you say the opposite??
>> >>>>
>> >>>>
>> >>>> Thanks for your answer,
>> >>>>
>> >>>> ESGLinux,
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>> >>>>>
>> >>>>> You need the following:
>> >>>>>
>> >>>>> - active namenode + zkfc
>> >>>>> - standby namenode + zkfc
>> >>>>> - pool of journal nodes (odd number, 3 or more)
>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>> >>>>>
>> >>>>> As the journal nodes hold the namesystem transactions they should not
>> >>>>> be co-located with the namenodes in case of failure.  I distribute the
>> >>>>> journal and zookeeper nodes across the hosts running datanodes or as
>> >>>>> Harsh says you could co-locate them on dedicated hosts.
>> >>>>>
>> >>>>> ZKFC does not monitor the JobTracker.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Craig
>> >>>>>
>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> Well, if I have understood you, I can configure my NN HA cluster
>> >>>>>> this way:
>> >>>>>>
>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>> >>>>>>
>> >>>>>> Is this right?
>> >>>>>>
>> >>>>>> Thanks in advance,
>> >>>>>>
>> >>>>>> ESGLinux,
>> >>>>>>
>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> There are two different things here: Automatic Failover and Quorum
>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>> >>>>>>> is to manage failovers automatically (based on health checks of NNs).
>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>> >>>>>>> storage for namesystem transactions that helps enable HA.
>> >>>>>>>
>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>> >>>>>>> quorum).
>> >>>>>>>
>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com> wrote:
>> >>>>>>> > Hi all,
>> >>>>>>> >
>> >>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA
>> >>>>>>> > cluster.
>> >>>>>>> >
>> >>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>> >>>>>>> > FailOver Controllers (ZKFC). I plan to place these 3 daemons this way:
>> >>>>>>> >
>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>> >>>>>>> >
>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that run a
>> >>>>>>> > NameNode make sense because the ZKFC monitors it, but what does the
>> >>>>>>> > third daemon do?
>> >>>>>>> >
>> >>>>>>> > As I read at this URL:
>> >>>>>>> >
>> >>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>> >>>>>>> >
>> >>>>>>> > these daemons are only related to NameNodes (Health monitoring -
>> >>>>>>> > the ZKFC pings its local NameNode on a periodic basis with a
>> >>>>>>> > health-check command.),
>> >>>>>>> > so what does the third ZKFC do? I used the JobTracker node but I could
>> >>>>>>> > use another node without any daemon on it...
>> >>>>>>> >
>> >>>>>>> > Thanks in advance,
>> >>>>>>> >
>> >>>>>>> > ESGLInux,
>> >>>>>>> >
>> >>>>>>> >
>> >>>>>>> >
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Harsh J
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>
>>
>
>


-- 
Harsh J

Re: question about ZKFC daemon

Posted by Harsh J <ha...@cloudera.com>.
Hi,

I fail to see your confusion.

ZKFC != ZK

ZK is quorum-based software, as QJM is. Run the ZK peers in odd numbers,
just as you would the JNs.
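
To illustrate why odd counts are suggested, here is a minimal sketch of the
majority arithmetic, assuming a plain majority quorum (the helper name
quorum_tolerance is made up for this example and is not part of ZooKeeper or
Hadoop):

```python
def quorum_tolerance(ensemble_size: int) -> tuple[int, int]:
    """Return (majority, tolerated_failures) for a majority-quorum ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # A majority is strictly more than half of the peers.
    majority = ensemble_size // 2 + 1
    # The ensemble stays available while a majority of peers is still up.
    return majority, ensemble_size - majority


for n in (3, 4, 5):
    majority, tolerated = quorum_tolerance(n)
    print(f"{n} peers: majority={majority}, tolerated failures={tolerated}")
```

Under that assumption a fourth peer tolerates no more failures than three do,
so the next useful step up from 3 is 5.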

ZKFC is what the NN needs for its Automatic Failover capability. It is a
client of ZK and therefore requires a running ZK ensemble; it is that
ensemble for which the odd number of nodes is suggested. ZKFC itself runs
only one per NN.
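
As a rough sketch of those rules, the check below takes a host-to-roles
mapping and reports anything that breaks "one ZKFC per NN" or "odd number of
ZK and JN peers". The host names, role names and helper functions are all
hypothetical; the second layout simply follows the co-location on master
hosts that the Cloudera text quoted in this thread describes.

```python
def check_ha_layout(layout: dict[str, set[str]]) -> list[str]:
    """Return a list of problems found in a host -> roles mapping."""
    problems = []
    for host, roles in sorted(layout.items()):
        # ZKFC health-checks its local NameNode, so the two go together.
        if "namenode" in roles and "zkfc" not in roles:
            problems.append(f"{host}: NameNode without a ZKFC")
        if "zkfc" in roles and "namenode" not in roles:
            problems.append(f"{host}: ZKFC without a local NameNode")
    # The quorum services are the ones that want an odd count of 3 or more.
    for role in ("zookeeper", "journalnode"):
        count = sum(role in roles for roles in layout.values())
        if count < 3 or count % 2 == 0:
            problems.append(f"{role}: {count} peers, expected an odd number >= 3")
    return problems


def zookeeper_quorum(layout: dict[str, set[str]], port: int = 2181) -> str:
    """Build a host:port list from the hosts that run a ZooKeeper peer."""
    hosts = sorted(h for h, roles in layout.items() if "zookeeper" in roles)
    return ",".join(f"{h}:{port}" for h in hosts)


# The three-ZKFC plan from the original question (hypothetical host names).
asked = {
    "nn1": {"namenode", "zkfc"},
    "nn2": {"namenode", "zkfc"},
    "jt1": {"jobtracker", "zkfc"},
}

# Two ZKFCs, three ZooKeeper peers and three JournalNodes on the same hosts.
suggested = {
    "nn1": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "nn2": {"namenode", "zkfc", "zookeeper", "journalnode"},
    "jt1": {"jobtracker", "zookeeper", "journalnode"},
}

print(check_ha_layout(asked))
print(check_ha_layout(suggested))
print(zookeeper_quorum(suggested))
```

The string built by zookeeper_quorum has the comma-separated host:port shape
that the ha.zookeeper.quorum property takes; 2181 is ZooKeeper's default
client port and is only an assumption about your setup.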


On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <es...@gmail.com> wrote:

> Hi all,
>
> I'm only testing the new HA feature. I'm not on a production system.
>
> Well, let's talk about the number of nodes and the ZKFC daemons.
>
> In this url:
>
> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>
> you can read:
> If you have configured automatic failover using the ZooKeeper
> FailoverController (ZKFC), you must install and start the zkfc daemon on
> each of the machines that runs a NameNode.
>
> So, the number of ZKFC daemons is two, but reading this url:
>
>
> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>
> you can read this:
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes
>
> I think that to ensure a good HA environment (of any kind) you need an odd
> number of nodes to avoid split-brain. The problem I see here is that if
> ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
> (active+standby).
>
> So I'm a bit confused by this deployment...
>
> Any suggestion?
>
> Thanks in advance for all your answers
>
> Kind regards,
>
> ESGLinux
>
>
>
>
> 2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>
>
>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
>> wrote:
>> > Hi ESGLinux,
>> >
>> > In production, you need to run QJM on at least 3 nodes.  You also need
>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>> > if you like, though.
>>
>> Er, this should read "You also need to run ZooKeeper on at least 3
>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>> active NN node and the standby NN node.
>>
>> Colin
>>
>> >
>> > Of course, none of this is "needed" to set up an example cluster.  If
>> > you just want to try something out, you can run everything on the same
>> > node if you want.  It depends on what you're trying to do.
>> >
>> > cheers,
>> > Colin
>> >
>> >
>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>> >> Thank you for your answer Craig,
>> >>
>> >> I´m planning my cluster and for now I´m not sure how many machines I
>> need;-)
>> >>
>> >> If I have doubt i´ll what clouder say and If have a problem I have
>> where to
>> >> ask for explications :-)
>> >>
>> >> ESGLinux
>> >>
>> >>
>> >>
>> >> 2012/12/28 Craig Munro <cr...@gmail.com>
>> >>>
>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>  If
>> >>> that's what Cloudera recommends then I'm sure it's fine.
>> >>>
>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>> >>>>
>> >>>> Hi Craig,
>> >>>>
>> >>>> I´m a bit confused, I have read this from cloudera:
>> >>>>
>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>> >>>>
>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>> can
>> >>>> reasonably be collocated on machines with other Hadoop daemons, for
>> example
>> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>> etc.) so the
>> >>>> JournalNodes' local directories can use the reliable local storage
>> on those
>> >>>> machines.
>> >>>> There must be at least three JournalNode daemons, since edit log
>> >>>> modifications must be written to a majority of JournalNodes
>> >>>>
>> >>>> as you can read they recommend to put journalnode daemons with the
>> >>>> namenodes, but you say the opposite.??¿?¿??
>> >>>>
>> >>>>
>> >>>> Thanks for your answer,
>> >>>>
>> >>>> ESGLinux,
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>> >>>>>
>> >>>>> You need the following:
>> >>>>>
>> >>>>> - active namenode + zkfc
>> >>>>> - standby namenode + zkfc
>> >>>>> - pool of journal nodes (odd number, 3 or more)
>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>> >>>>>
>> >>>>> As the journal nodes hold the namesystem transactions they should
>> not be
>> >>>>> co-located with the namenodes in case of failure.  I distribute the
>> journal
>> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
>> says you
>> >>>>> could co-locate them on dedicated hosts.
>> >>>>>
>> >>>>> ZKFC does not monitor the JobTracker.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Craig
>> >>>>>
>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> well, If I have understand you I can configure my NN HA cluster
>> this
>> >>>>>> way:
>> >>>>>>
>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>> >>>>>>
>> >>>>>> Is this right?
>> >>>>>>
>> >>>>>> Thanks in advance,
>> >>>>>>
>> >>>>>> ESGLinux,
>> >>>>>>
>> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> There are two different things here: Automatic Failover and Quorum
>> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
>> Controller,
>> >>>>>>> is to manage failovers automatically (based on health checks of
>> NNs).
>> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>> >>>>>>> storage for namesystem transactions that helps enable HA.
>> >>>>>>>
>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>> >>>>>>> reliable HA, preferably on nodes of their own if possible (like
>> you
>> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>> >>>>>>> quorum).
>> >>>>>>>
>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
>> wrote:
>> >>>>>>> > Hi all,
>> >>>>>>> >
>> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>> >>>>>>> > cluster,
>> >>>>>>> >
>> >>>>>>> > As far as I know, I need at least three nodes to run three
>> ZooKeeper
>> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
>> way:
>> >>>>>>> >
>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>> >>>>>>> >
>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
>> runs
>> >>>>>>> > a
>> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
>> the
>> >>>>>>> > third
>> >>>>>>> > daemon?
>> >>>>>>> >
>> >>>>>>> > as I read from this url:
>> >>>>>>> >
>> >>>>>>> >
>> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>> >>>>>>> >
>> >>>>>>> > this daemons are only related with NameNodes, (Health
>> monitoring -
>> >>>>>>> > the ZKFC
>> >>>>>>> > pings its local NameNode on a periodic basis with a health-check
>> >>>>>>> > command.)
>> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
>> could
>> >>>>>>> > use
>> >>>>>>> > another node without any daemon on it...
>> >>>>>>> >
>> >>>>>>> > Thanks in advance,
>> >>>>>>> >
>> >>>>>>> > ESGLInux,
>> >>>>>>> >
>> >>>>>>> >
>> >>>>>>> >
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Harsh J
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>
>>
>
>


-- 
Harsh J

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
Hi all,

I'm only testing the new HA feature. I'm not on a production system.

Well, let's talk about the number of nodes and the ZKFC daemons.

In this url:
https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover

you can read:
If you have configured automatic failover using the ZooKeeper
FailoverController (ZKFC), you must install and start the zkfc daemon on
each of the machines that runs a NameNode.

So, the number of ZKFC daemons is two, but reading this url:

http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper

you can read this:
In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes

I think that to ensure a good HA environment (of any kind) you need an odd
number of nodes to avoid split-brain. The problem I see here is that if
ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
(active+standby).

So I'm a bit confused by this deployment...

Any suggestion?

Thanks in advance for all your answers

Kind regards,

ESGLinux




2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>

> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
> wrote:
> > Hi ESGLinux,
> >
> > In production, you need to run QJM on at least 3 nodes.  You also need
> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> > if you like, though.
>
> Er, this should read "You also need to run ZooKeeper on at least 3
> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
> active NN node and the standby NN node.
>
> Colin
>
> >
> > Of course, none of this is "needed" to set up an example cluster.  If
> > you just want to try something out, you can run everything on the same
> > node if you want.  It depends on what you're trying to do.
> >
> > cheers,
> > Colin
> >
> >
> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
> >> Thank you for your answer Craig,
> >>
> >> I´m planning my cluster and for now I´m not sure how many machines I
> need;-)
> >>
> >> If I have doubt i´ll what clouder say and If have a problem I have
> where to
> >> ask for explications :-)
> >>
> >> ESGLinux
> >>
> >>
> >>
> >> 2012/12/28 Craig Munro <cr...@gmail.com>
> >>>
> >>> OK, I have reliable storage on my datanodes so not an issue for me.  If
> >>> that's what Cloudera recommends then I'm sure it's fine.
> >>>
> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
> >>>>
> >>>> Hi Craig,
> >>>>
> >>>> I´m a bit confused, I have read this from cloudera:
> >>>>
> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
> >>>>
> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
> >>>> reasonably be collocated on machines with other Hadoop daemons, for
> example
> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.)
> so the
> >>>> JournalNodes' local directories can use the reliable local storage on
> those
> >>>> machines.
> >>>> There must be at least three JournalNode daemons, since edit log
> >>>> modifications must be written to a majority of JournalNodes
> >>>>
> >>>> as you can read they recommend to put journalnode daemons with the
> >>>> namenodes, but you say the opposite.??¿?¿??
> >>>>
> >>>>
> >>>> Thanks for your answer,
> >>>>
> >>>> ESGLinux,
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
> >>>>>
> >>>>> You need the following:
> >>>>>
> >>>>> - active namenode + zkfc
> >>>>> - standby namenode + zkfc
> >>>>> - pool of journal nodes (odd number, 3 or more)
> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
> >>>>>
> >>>>> As the journal nodes hold the namesystem transactions they should
> not be
> >>>>> co-located with the namenodes in case of failure.  I distribute the
> journal
> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
> says you
> >>>>> could co-locate them on dedicated hosts.
> >>>>>
> >>>>> ZKFC does not monitor the JobTracker.
> >>>>>
> >>>>> Regards,
> >>>>> Craig
> >>>>>
> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> well, If I have understand you I can configure my NN HA cluster this
> >>>>>> way:
> >>>>>>
> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
> >>>>>>
> >>>>>> Is this right?
> >>>>>>
> >>>>>> Thanks in advance,
> >>>>>>
> >>>>>> ESGLinux,
> >>>>>>
> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> There are two different things here: Automatic Failover and Quorum
> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
> Controller,
> >>>>>>> is to manage failovers automatically (based on health checks of
> NNs).
> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
> >>>>>>> storage for namesystem transactions that helps enable HA.
> >>>>>>>
> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
> >>>>>>> quorum).
> >>>>>>>
> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
> wrote:
> >>>>>>> > Hi all,
> >>>>>>> >
> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
> >>>>>>> > cluster,
> >>>>>>> >
> >>>>>>> > As far as I know, I need at least three nodes to run three
> ZooKeeper
> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
> way:
> >>>>>>> >
> >>>>>>> > - Active NameNode + 1 ZKFC daemon
> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
> >>>>>>> >
> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
> runs
> >>>>>>> > a
> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
> the
> >>>>>>> > third
> >>>>>>> > daemon?
> >>>>>>> >
> >>>>>>> > as I read from this url:
> >>>>>>> >
> >>>>>>> >
> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
> >>>>>>> >
> >>>>>>> > this daemons are only related with NameNodes, (Health monitoring
> -
> >>>>>>> > the ZKFC
> >>>>>>> > pings its local NameNode on a periodic basis with a health-check
> >>>>>>> > command.)
> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
> could
> >>>>>>> > use
> >>>>>>> > another node without any daemon on it...
> >>>>>>> >
> >>>>>>> > Thanks in advance,
> >>>>>>> >
> >>>>>>> > ESGLInux,
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Harsh J
> >>>>>>
> >>>>>>
> >>>>
> >>
>

Re: question about ZKFC daemon

Posted by ESGLinux <es...@gmail.com>.
Hi all,

I´m only testing the new HA feature. I´m not in a production system,

Well, let´s talk about the number of nodes and the ZKFC daemons.

In this url:
https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover

you can read:
If you have configured automatic failover using the ZooKeeper
FailoverController (ZKFC), you must install and start the zkfc daemon on
each of the machines that runs a NameNode.

So the number of ZKFC daemons is two, but at this URL:

http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper

you can read this:
In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes

I think that to ensure a good HA environment (of any kind) you need an odd
number of nodes to avoid split-brain. The problem I see here is that if
ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
(active + standby).
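
The "odd number" advice comes down to majority arithmetic, which can be
sketched as below. This is a minimal illustration, not taken from the
Cloudera documentation; the function names are made up for the example.
It applies to the daemons that vote (the ZooKeeper ensemble and the
JournalNodes), not to the ZKFCs, which are only clients of ZooKeeper.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return ensemble_size - majority(ensemble_size)


# Compare even and odd ensemble sizes side by side.
for size in (2, 3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

Going from 3 to 4 members does not raise the number of failures that
can be tolerated, which is why ensembles of 3 or 5 are the usual choice.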

So I'm a bit confused by this deployment...

Any suggestion?

Thanks in advance for all your answers

Kind regards,

ESGLinux




2013/1/14 Colin McCabe <cm...@alumni.cmu.edu>

> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu>
> wrote:
> > Hi ESGLinux,
> >
> > In production, you need to run QJM on at least 3 nodes.  You also need
> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> > if you like, though.
>
> Er, this should read "You also need to run ZooKeeper on at least 3
> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
> active NN node and the standby NN node.
>
> Colin
>
> >
> > Of course, none of this is "needed" to set up an example cluster.  If
> > you just want to try something out, you can run everything on the same
> > node if you want.  It depends on what you're trying to do.
> >
> > cheers,
> > Colin
> >
> >
> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
> >> Thank you for your answer Craig,
> >>
> >> I´m planning my cluster and for now I´m not sure how many machines I
> need;-)
> >>
> >> If I have doubt i´ll what clouder say and If have a problem I have
> where to
> >> ask for explications :-)
> >>
> >> ESGLinux
> >>
> >>
> >>
> >> 2012/12/28 Craig Munro <cr...@gmail.com>
> >>>
> >>> OK, I have reliable storage on my datanodes so not an issue for me.  If
> >>> that's what Cloudera recommends then I'm sure it's fine.
> >>>
> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
> >>>>
> >>>> Hi Craig,
> >>>>
> >>>> I´m a bit confused, I have read this from cloudera:
> >>>>
> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
> >>>>
> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
> >>>> reasonably be collocated on machines with other Hadoop daemons, for
> example
> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.)
> so the
> >>>> JournalNodes' local directories can use the reliable local storage on
> those
> >>>> machines.
> >>>> There must be at least three JournalNode daemons, since edit log
> >>>> modifications must be written to a majority of JournalNodes
> >>>>
> >>>> as you can read they recommend to put journalnode daemons with the
> >>>> namenodes, but you say the opposite.??¿?¿??
> >>>>
> >>>>
> >>>> Thanks for your answer,
> >>>>
> >>>> ESGLinux,
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> 2012/12/28 Craig Munro <cr...@gmail.com>
> >>>>>
> >>>>> You need the following:
> >>>>>
> >>>>> - active namenode + zkfc
> >>>>> - standby namenode + zkfc
> >>>>> - pool of journal nodes (odd number, 3 or more)
> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
> >>>>>
> >>>>> As the journal nodes hold the namesystem transactions they should
> not be
> >>>>> co-located with the namenodes in case of failure.  I distribute the
> journal
> >>>>> and zookeeper nodes across the hosts running datanodes or as Harsh
> says you
> >>>>> could co-locate them on dedicated hosts.
> >>>>>
> >>>>> ZKFC does not monitor the JobTracker.
> >>>>>
> >>>>> Regards,
> >>>>> Craig
> >>>>>
> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> well, If I have understand you I can configure my NN HA cluster this
> >>>>>> way:
> >>>>>>
> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
> >>>>>>
> >>>>>> Is this right?
> >>>>>>
> >>>>>> Thanks in advance,
> >>>>>>
> >>>>>> ESGLinux,
> >>>>>>
> >>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> There are two different things here: Automatic Failover and Quorum
> >>>>>>> Journal Manager. The former, used via a ZooKeeper Failover
> Controller,
> >>>>>>> is to manage failovers automatically (based on health checks of
> NNs).
> >>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
> >>>>>>> storage for namesystem transactions that helps enable HA.
> >>>>>>>
> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
> >>>>>>> reliable HA, preferably on nodes of their own if possible (like you
> >>>>>>> would for typical ZooKeepers, and you may co-locate with those as
> >>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
> >>>>>>> quorum).
> >>>>>>>
> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com>
> wrote:
> >>>>>>> > Hi all,
> >>>>>>> >
> >>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
> >>>>>>> > cluster,
> >>>>>>> >
> >>>>>>> > As far as I know, I need at least three nodes to run three
> ZooKeeper
> >>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this
> way:
> >>>>>>> >
> >>>>>>> > - Active NameNode + 1 ZKFC daemon
> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
> >>>>>>> >
> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
> runs
> >>>>>>> > a
> >>>>>>> > namenode are right because the ZKFC monitors it, but what does
> the
> >>>>>>> > third
> >>>>>>> > daemon?
> >>>>>>> >
> >>>>>>> > as I read from this url:
> >>>>>>> >
> >>>>>>> >
> https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
> >>>>>>> >
> >>>>>>> > this daemons are only related with NameNodes, (Health monitoring
> -
> >>>>>>> > the ZKFC
> >>>>>>> > pings its local NameNode on a periodic basis with a health-check
> >>>>>>> > command.)
> >>>>>>> > so what does the third ZKFC? I used the jobtracker node but I
> could
> >>>>>>> > use
> >>>>>>> > another node without any daemon on it...
> >>>>>>> >
> >>>>>>> > Thanks in advance,
> >>>>>>> >
> >>>>>>> > ESGLInux,
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Harsh J
> >>>>>>
> >>>>>>
> >>>>
> >>
>

Re: question about ZKFC daemon

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu> wrote:
> Hi ESGLinux,
>
> In production, you need to run QJM on at least 3 nodes.  You also need
> to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> if you like, though.

Er, this should read "You also need to run ZooKeeper on at least 3
nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
active NN node and the standby NN node.
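
As a rough sketch of that rule, a planned layout can be checked like
this. The host names, role names and the check_layout helper are all
hypothetical, made up for illustration. The "at least 3" threshold is
the production advice above; a throwaway test cluster can ignore it.

```python
from collections import Counter

# Hypothetical three-host layout; host names are made up.
layout = {
    "nn1": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "nn2": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "jt1": {"jobtracker", "journalnode", "zookeeper"},
}


def check_layout(layout):
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    # A ZKFC runs on exactly the hosts that run a NameNode.
    for host, roles in sorted(layout.items()):
        if ("zkfc" in roles) != ("namenode" in roles):
            problems.append(f"{host}: zkfc and namenode must run together")
    # The voting daemons need at least three instances each.
    counts = Counter(role for roles in layout.values() for role in roles)
    for role in ("zookeeper", "journalnode"):
        if counts[role] < 3:
            problems.append(f"{role}: {counts[role]} instance(s), want at least 3")
    return problems


print(check_layout(layout))
```

Adding "zkfc" to the JobTracker host, as in the original plan, would
make the first check fail.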

Colin

>
> Of course, none of this is "needed" to set up an example cluster.  If
> you just want to try something out, you can run everything on the same
> node if you want.  It depends on what you're trying to do.
>
> cheers,
> Colin
>
>
> On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>> Thank you for your answer Craig,
>>
>> I´m planning my cluster and for now I´m not sure how many machines I need;-)
>>
>> If I have doubt i´ll what clouder say and If have a problem I have where to
>> ask for explications :-)
>>
>> ESGLinux
>>
>>
>>
>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>
>>> OK, I have reliable storage on my datanodes so not an issue for me.  If
>>> that's what Cloudera recommends then I'm sure it's fine.
>>>
>>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>
>>>> Hi Craig,
>>>>
>>>> I´m a bit confused, I have read this from cloudera:
>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>>
>>>> The JournalNode daemon is relatively lightweight, so these daemons can
>>>> reasonably be collocated on machines with other Hadoop daemons, for example
>>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
>>>> JournalNodes' local directories can use the reliable local storage on those
>>>> machines.
>>>> There must be at least three JournalNode daemons, since edit log
>>>> modifications must be written to a majority of JournalNodes
>>>>
>>>> as you can read they recommend to put journalnode daemons with the
>>>> namenodes, but you say the opposite.??¿?¿??
>>>>
>>>>
>>>> Thanks for your answer,
>>>>
>>>> ESGLinux,
>>>>
>>>>
>>>>
>>>>
>>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>>>
>>>>> You need the following:
>>>>>
>>>>> - active namenode + zkfc
>>>>> - standby namenode + zkfc
>>>>> - pool of journal nodes (odd number, 3 or more)
>>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>>>
>>>>> As the journal nodes hold the namesystem transactions they should not be
>>>>> co-located with the namenodes in case of failure.  I distribute the journal
>>>>> and zookeeper nodes across the hosts running datanodes or as Harsh says you
>>>>> could co-locate them on dedicated hosts.
>>>>>
>>>>> ZKFC does not monitor the JobTracker.
>>>>>
>>>>> Regards,
>>>>> Craig
>>>>>
>>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> well, If I have understand you I can configure my NN HA cluster this
>>>>>> way:
>>>>>>
>>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>>>>
>>>>>> Is this right?
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> ESGLinux,
>>>>>>
>>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> There are two different things here: Automatic Failover and Quorum
>>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>>>>>>> is to manage failovers automatically (based on health checks of NNs).
>>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>>>>>>> storage for namesystem transactions that helps enable HA.
>>>>>>>
>>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>>>>> quorum).
>>>>>>>
>>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com> wrote:
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > I have a doubt about how to deploy the Zookeeper in a NN HA
>>>>>>> > cluster,
>>>>>>> >
>>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>>>>>>> > FailOver Controller (ZKFC). I plan to put these 3 daemons this way:
>>>>>>> >
>>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>>>>> >
>>>>>>> > so the quorum is formed with these three nodes. The nodes that runs
>>>>>>> > a
>>>>>>> > namenode are right because the ZKFC monitors it, but what does the
>>>>>>> > third
>>>>>>> > daemon?
>>>>>>> >
>>>>>>> > as I read from this url:
>>>>>>> >
>>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>>>> >
>>>>>>> > this daemons are only related with NameNodes, (Health monitoring -
>>>>>>> > the ZKFC
>>>>>>> > pings its local NameNode on a periodic basis with a health-check
>>>>>>> > command.)
>>>>>>> > so what does the third ZKFC? I used the jobtracker node but I could
>>>>>>> > use
>>>>>>> > another node without any daemon on it...
>>>>>>> >
>>>>>>> > Thanks in advance,
>>>>>>> >
>>>>>>> > ESGLInux,
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Harsh J
>>>>>>
>>>>>>
>>>>
>>

Re: question about ZKFC daemon

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <cm...@alumni.cmu.edu> wrote:
> Hi ESGLinux,
>
> In production, you need to run QJM on at least 3 nodes.  You also need
> to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> if you like, though.

Er, this should read "You also need to run ZooKeeper on at least 3
nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
active NN node and the standby NN node.
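
The placement rules above can be sketched as a small check. This is only
an illustration, not part of Hadoop: the host names (nn1, nn2, jt), the
daemon labels and both functions are hypothetical, and the ZooKeeper
placement in the corrected layout is just one possible choice.

```python
def tolerated_failures(members):
    """A majority quorum of `members` daemons survives (members - 1) // 2 failures."""
    return (members - 1) // 2


def check_ha_layout(layout):
    """Return a list of problems found in a planned NameNode HA layout.

    `layout` maps a host name to the set of daemon labels planned for it.
    """
    problems = []

    def hosts_running(daemon):
        return {host for host, daemons in layout.items() if daemon in daemons}

    namenodes = hosts_running("namenode")
    zkfcs = hosts_running("zkfc")

    if len(namenodes) != 2:
        problems.append(
            f"expected 2 NameNode hosts (active + standby), found {len(namenodes)}"
        )
    # ZKFC only makes sense next to a NameNode, and each NameNode needs one.
    for host in sorted(zkfcs - namenodes):
        problems.append(f"{host}: ZKFC has no local NameNode to monitor")
    for host in sorted(namenodes - zkfcs):
        problems.append(f"{host}: NameNode has no ZKFC")
    # JournalNodes and ZooKeeper both rely on a majority, so at least 3 each.
    for daemon in ("journalnode", "zookeeper"):
        count = len(hosts_running(daemon))
        if count < 3:
            problems.append(f"{daemon}: need at least 3 hosts, found {count}")
    return problems


# The layout proposed earlier in the thread (no ZooKeeper hosts were listed).
proposed = {
    "nn1": {"namenode", "zkfc", "journalnode"},
    "nn2": {"namenode", "zkfc", "journalnode"},
    "jt": {"jobtracker", "zkfc", "journalnode"},
}

# One possible corrected layout: ZKFC only on the two NameNode hosts.
corrected = {
    "nn1": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "nn2": {"namenode", "zkfc", "journalnode", "zookeeper"},
    "jt": {"jobtracker", "journalnode", "zookeeper"},
}

for problem in check_ha_layout(proposed):
    print(problem)
print(check_ha_layout(corrected))  # prints []
```

Since tolerated_failures(3) and tolerated_failures(4) both return 1, a
fourth JournalNode or ZooKeeper node adds no extra fault tolerance, which
is why odd pool sizes are usually suggested.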

Colin

>
> Of course, none of this is "needed" to set up an example cluster.  If
> you just want to try something out, you can run everything on the same
> node if you want.  It depends on what you're trying to do.
>
> cheers,
> Colin
>
>
> On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <es...@gmail.com> wrote:
>> Thank you for your answer Craig,
>>
>> I'm planning my cluster and for now I'm not sure how many machines I need ;-)
>>
>> If I have doubts I'll follow what Cloudera says, and if I have a problem I know
>> where to ask for explanations :-)
>>
>> ESGLinux
>>
>>
>>
>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>
>>> OK, I have reliable storage on my datanodes so not an issue for me.  If
>>> that's what Cloudera recommends then I'm sure it's fine.
>>>
>>> On Dec 28, 2012 10:38 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>
>>>> Hi Craig,
>>>>
>>>> I'm a bit confused. I have read this from Cloudera:
>>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>>
>>>> The JournalNode daemon is relatively lightweight, so these daemons can
>>>> reasonably be collocated on machines with other Hadoop daemons, for example
>>>> NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
>>>> JournalNodes' local directories can use the reliable local storage on those
>>>> machines.
>>>> There must be at least three JournalNode daemons, since edit log
>>>> modifications must be written to a majority of JournalNodes
>>>>
>>>> As you can see, they recommend putting the JournalNode daemons on the
>>>> NameNode hosts, but you say the opposite?
>>>>
>>>>
>>>> Thanks for your answer,
>>>>
>>>> ESGLinux,
>>>>
>>>>
>>>>
>>>>
>>>> 2012/12/28 Craig Munro <cr...@gmail.com>
>>>>>
>>>>> You need the following:
>>>>>
>>>>> - active namenode + zkfc
>>>>> - standby namenode + zkfc
>>>>> - pool of journal nodes (odd number, 3 or more)
>>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>>>
>>>>> As the journal nodes hold the namesystem transactions they should not be
>>>>> co-located with the namenodes in case of failure.  I distribute the journal
>>>>> and zookeeper nodes across the hosts running datanodes or as Harsh says you
>>>>> could co-locate them on dedicated hosts.
>>>>>
>>>>> ZKFC does not monitor the JobTracker.
>>>>>
>>>>> Regards,
>>>>> Craig
>>>>>
>>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <es...@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Well, if I have understood you correctly, I can configure my NN HA
>>>>>> cluster this way:
>>>>>>
>>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>>>>
>>>>>> Is this right?
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> ESGLinux,
>>>>>>
>>>>>> 2012/12/27 Harsh J <ha...@cloudera.com>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> There are two different things here: Automatic Failover and Quorum
>>>>>>> Journal Manager. The former, used via a ZooKeeper Failover Controller,
>>>>>>> is to manage failovers automatically (based on health checks of NNs).
>>>>>>> The latter, used via a set of Journal Nodes, is a medium of shared
>>>>>>> storage for namesystem transactions that helps enable HA.
>>>>>>>
>>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes for
>>>>>>> reliable HA, preferably on nodes of their own if possible (like you
>>>>>>> would for typical ZooKeepers, and you may co-locate with those as
>>>>>>> well) and one ZKFC for each NameNode (connected to the same ZK
>>>>>>> quorum).
>>>>>>>
>>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <es...@gmail.com> wrote:
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA
>>>>>>> > cluster.
>>>>>>> >
>>>>>>> > As far as I know, I need at least three nodes to run three ZooKeeper
>>>>>>> > Failover Controllers (ZKFC). I plan to place these 3 daemons this way:
>>>>>>> >
>>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>>>>> > - JobTracker node + 1 ZKFC daemon (is this right?)
>>>>>>> >
>>>>>>> > so the quorum is formed with these three nodes. The nodes that run a
>>>>>>> > NameNode make sense because the ZKFC monitors it, but what does the
>>>>>>> > third daemon do?
>>>>>>> >
>>>>>>> > As I read at this URL:
>>>>>>> >
>>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>>>>> >
>>>>>>> > these daemons are only related to NameNodes ("Health monitoring - the
>>>>>>> > ZKFC pings its local NameNode on a periodic basis with a health-check
>>>>>>> > command."), so what does the third ZKFC do? I used the JobTracker node,
>>>>>>> > but I could use another node without any daemon on it...
>>>>>>> >
>>>>>>> > Thanks in advance,
>>>>>>> >
>>>>>>> > ESGLInux,
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Harsh J
>>>>>>
>>>>>>
>>>>
>>
